Foreword


This book, together with Linear Algebra, constitutes a curriculum for an
algebra program addressed to undergraduates. 

The separation of the linear algebra from the other basic algebraic 
structures fits all existing tendencies affecting undergraduate teaching, 
and I agree with these tendencies. I have made the present book self 
contained logically, but it is probably better if students take the linear 
algebra course before being introduced to the more abstract notions of 
groups, rings, and fields, and the systematic development of their basic 
abstract properties. There is of course a little overlap with the book
Linear Algebra, since I wanted to make the present book self contained. I
define vector spaces, matrices, and linear maps and prove their basic 
properties. 

The present book could be used for a one-term course, or a year’s 
course, possibly combining it with Linear Algebra. I think it is important 
to do the field theory and the Galois theory, more important, say, than 
to do much more group theory than we have done here. There is a 
chapter on finite fields, which exhibit both features from general field 
theory, and special features due to characteristic p. Such fields have 
become important in coding theory. 

There is also a chapter on some of the group-theoretic features of 
matrix groups. Courses in linear algebra usually concentrate on the 
structure theorems, quadratic forms, Jordan form, etc. and do not have 
the time to mention, let alone emphasize, the group-theoretic aspects of 
matrix groups. I find that the basic algebra course is a good place to 
introduce students to such examples, which mix abstract group theory 
with matrix theory. The groups of matrices provide concrete examples 
for the more abstract properties of groups listed in Chapter II.

The construction of the real numbers by Cauchy sequences and null 
sequences has no standard place in the curriculum, depending as it does 
on mixed ideas from algebra and analysis. Again, I think it belongs in a 
basic algebra text. It illustrates considerations having to do with rings, 
and also with ordering and absolute values. The notion of completion is 
partly algebraic and partly analytic. Cauchy sequences occur in mathe- 
matics courses on analysis (integration theory for instance), and also 
number theory as in the theory of p-adic numbers or Galois groups. 

For a year’s course, I would also regard it as appropriate to introduce 
students to the general language currently in use in mathematics con- 
cerning sets and mappings, up to and including Zorn’s lemma. In this 
spirit, I have included a chapter on sets and cardinal numbers which is 
much more extensive than is the custom. One reason is that the state- 
ments proved here are not easy to find in the literature, disjoint from 
highly technical books on set theory. Thus Chapter X will provide at- 
tractive extra material if time is available. This part of the book, to- 
gether with the Appendix, and the construction of the real and complex 
numbers, also can be viewed as a short course on the naive foundations 
of the basic mathematical objects. 

If all these topics are covered, then there is enough material for a 
year’s course. Different instructors will choose different combinations ac- 
cording to their tastes. For a one-term course, I would find it appropri- 
ate to cover the book up to the chapter on field theory, or the matrix 
groups. Finite fields can be treated as optional. 

Elementary introductory texts in mathematics, like the present one, 
should be simple and always provide concrete examples together with the 
development of the abstractions (which explains using the real and com- 
plex numbers as examples before they are treated logically in the text). 
The desire to avoid encyclopedic proportions, and specialized emphasis, 
and to keep the book short explains the omission of some theorems 
which some teachers will miss and may want to include in the course. 
Exceptionally talented students can always take more advanced classes, 
and for them one can use the more comprehensive advanced texts which 
are easily available. 

New Haven, Connecticut, 1987    S. Lang


Foreword to the Third Edition


In this new edition I have added new material in Chapters IV and VI, first on 
polynomials, and second on linear algebra in combination with group 
theory. The additions to Chapter VI describe various product structures for
$SL_n$ (Iwasawa and other decompositions). These also have to do with the
conjugation action and the decomposition of the Lie algebra under this
action. The algebra involved comes from deeper theories, but the parts I
have extracted on $SL_n$ belong to an elementary level. Students are then put
into contact with some algebra used as a backdrop for analysis on groups,
starting with $SL_n$.

A new section in Chapter IV gives a complete account of the Mason- 
Stothers theorem about polynomials, with Noah Snyder’s beautifully simple 
proof. It is worth emphasizing that the derivative for polynomials is a purely 
algebraic operation, for which limits are not required. A Springer pamphlet 
has been published to present a self-contained treatment of polynomials 
(from scratch) culminating with this topic. Here it takes its place as a section 
in the general chapter on polynomials. It occurs as a natural twin for the 
section on the abc conjecture. 

I have tried on several occasions to put students in contact with genuine 
research mathematics, by selecting instances of conjectures which can be for- 
mulated in language at the level of this course. I have stated more than half 
a dozen such conjectures, of which the abc conjecture provides one spectac- 
ular example. Usually students have to wait years before they realize that 
mathematics is a live activity, sustained by its open problems. I have found 
it very effective to break down this obstacle whenever possible.


CHAPTER I


The Integers 


I, §1. TERMINOLOGY OF SETS 

A collection of objects is called a set. A member of this collection is also
called an element of the set. It is useful in practice to use short symbols
to denote certain sets. For instance, we denote by Z the set of all integers,
i.e. all numbers of the type 0, ±1, ±2, .... Instead of saying that x
is an element of a set S, we shall also frequently say that x lies in S, and
write $x \in S$. For instance, we have $1 \in Z$, and also $-4 \in Z$.

If S and S' are sets, and if every element of S' is an element of S, then
we say that S' is a subset of S. Thus the set of positive integers
{1, 2, 3, ...} is a subset of the set of all integers. To say that S' is a subset
of S is to say that S' is part of S. Observe that our definition of a subset
does not exclude the possibility that S' = S. If S' is a subset of S, but
$S' \neq S$, then we shall say that S' is a proper subset of S. Thus Z is a
subset of Z, and the set of positive integers is a proper subset of Z. To
denote the fact that S' is a subset of S, we write $S' \subset S$, and also say that
S' is contained in S.

If $S_1$, $S_2$ are sets, then the intersection of $S_1$ and $S_2$, denoted by
$S_1 \cap S_2$, is the set of elements which lie in both $S_1$ and $S_2$. For instance,
if $S_1$ is the set of integers $\geq 1$ and $S_2$ is the set of integers $\leq 1$, then

\[ S_1 \cap S_2 = \{1\} \]

(the set consisting of the number 1).

The union of $S_1$ and $S_2$, denoted by $S_1 \cup S_2$, is the set of elements
which lie in $S_1$ or in $S_2$. For instance, if $S_1$ is the set of integers $\leq 0$
and $S_2$ is the set of integers $\geq 0$, then $S_1 \cup S_2 = Z$ is the set of all
integers.

We see that certain sets consist of elements described by certain prop- 
erties. If a set has no elements, it is called the empty set. For instance, 
the set of all integers x such that x > 0 and x < 0 is empty, because 
there is no such integer x. 

If S, S' are sets, we denote by $S \times S'$ the set of all pairs (x, x') with
$x \in S$ and $x' \in S'$.

We let #S denote the number of elements of a set S. If S is finite, we
also call #S the order of S.


I, §2. BASIC PROPERTIES 

The integers are so well known that it would be slightly tedious to axio- 
matize them immediately. Hence we shall assume that the reader is ac- 
quainted with the elementary properties of arithmetic, involving addition, 
multiplication, and inequalities, which are taught in all elementary 
schools. In the appendix and in Chapter III, the reader will see how one 
can axiomatize such rules concerning addition and multiplication. For 
the rules concerning inequalities and ordering, see Chapter IX. 

We mention explicitly one property of the integers which we take as 
an axiom concerning them, and which is called well-ordering. 

Every non-empty set of integers $\geq 0$ has a least element.

(This means: If S is a non-empty set of integers $\geq 0$, then there exists an
integer $n \in S$ such that $n \leq x$ for all $x \in S$.)

Using this well-ordering, we shall prove another property of the inte- 
gers, called induction. It occurs in several forms. 


Induction: First Form. Suppose that for each integer $n \geq 1$ we are
given an assertion A(n), and that we can prove the following two
properties:

(1) The assertion A(1) is true.

(2) For each integer $n \geq 1$, if A(n) is true, then A(n + 1) is true.

Then for all integers $n \geq 1$, the assertion A(n) is true.


Proof. Let S be the set of all positive integers n for which the assertion
A(n) is false. We wish to prove that S is empty, i.e. that there is no
element in S. Suppose there is some element in S. By well-ordering,
there exists a least element $n_0$ in S. By assumption, $n_0 \neq 1$, and hence
$n_0 > 1$. Since $n_0$ is least, it follows that $n_0 - 1$ is not in S, in other
words the assertion $A(n_0 - 1)$ is true. But then by property (2), we
conclude that $A(n_0)$ is also true because

\[ n_0 = (n_0 - 1) + 1. \]

This is a contradiction, which proves what we wanted.


Example. We wish to prove that for each integer $n \geq 1$,

\[ A(n): \quad 1 + 2 + \cdots + n = \frac{n(n + 1)}{2}. \]

This is certainly true when n = 1, because

\[ 1 = \frac{1(1 + 1)}{2}. \]

Assume that our equation is true for an integer $n \geq 1$. Then

\[
\begin{aligned}
1 + \cdots + n + (n + 1) &= \frac{n(n + 1)}{2} + (n + 1) \\
&= \frac{n(n + 1) + 2(n + 1)}{2} \\
&= \frac{n^2 + n + 2n + 2}{2} \\
&= \frac{(n + 1)(n + 2)}{2}.
\end{aligned}
\]

Thus we have proved the two properties (1) and (2) for the statement
denoted by A(n), and we conclude by induction that A(n) is true for
all integers $n \geq 1$.
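
As a numerical sanity check, which is not a substitute for the induction argument, the following Python sketch compares the direct sum 1 + 2 + ... + n with the closed form n(n + 1)/2 over a finite range of n. The function names and the bound 200 are illustrative choices and do not come from the text.

```python
def closed_form(n):
    """Right-hand side of A(n): n(n + 1)/2 (always an integer)."""
    return n * (n + 1) // 2


def direct_sum(n):
    """Left-hand side of A(n): 1 + 2 + ... + n, added term by term."""
    return sum(range(1, n + 1))


def check_up_to(limit):
    """Return True if A(n) holds for every n with 1 <= n <= limit."""
    return all(direct_sum(n) == closed_form(n) for n in range(1, limit + 1))


# A finite check only; the proof for all n is the induction above.
print(check_up_to(200))
```

A check of this kind can expose a wrong conjectured formula quickly, but only the induction step establishes the statement for every n.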

Remark. In the statement of induction, we could replace 1 by 0 
everywhere, and the proof would go through just as well. 


The second form is a variation on the first.

Induction: Second Form. Suppose that for each integer $n \geq 0$ we are
given an assertion A(n), and that we can prove the following two
properties:

(1') The assertion A(0) is true.

(2') For each integer n > 0, if A(k) is true for every integer k with
$0 \leq k < n$, then A(n) is true.

Then the assertion A(n) is true for all integers $n \geq 0$.

Proof. Again let S be the set of integers $\geq 0$ for which the assertion is
false. Suppose that S is not empty, and let $n_0$ be the least element of S.
Then $n_0 \neq 0$ by assumption (1'), and since $n_0$ is least, for every integer k
with $0 \leq k < n_0$, the assertion A(k) is true. By (2') we conclude that
$A(n_0)$ is true, a contradiction which proves our second form of induction.

As an example of well ordering, we shall prove the statement known
as the Euclidean algorithm.


Theorem 2.1. Let m, n be integers and m > 0. Then there exist integers
q, r with $0 \leq r < m$ such that

\[ n = qm + r. \]

The integers q, r are uniquely determined by these conditions.

Proof. The set of integers q such that $qm \leq n$ is bounded from above
(proof?), and therefore by well ordering has a largest element satisfying

\[ qm \leq n < (q + 1)m = qm + m. \]

Hence

\[ 0 \leq n - qm < m. \]

Let $r = n - qm$. Then $0 \leq r < m$. This proves the existence of the integers
q and r as desired.

As for uniqueness, suppose that

\[ n = q_1 m + r_1, \quad 0 \leq r_1 < m, \]

\[ n = q_2 m + r_2, \quad 0 \leq r_2 < m. \]

Suppose $r_1 \neq r_2$, say $r_2 > r_1$. Subtracting, we obtain

\[ (q_1 - q_2)m = r_2 - r_1. \]

But $r_2 - r_1 < m$, and $r_2 - r_1 > 0$. This is impossible because $q_1 - q_2$ is
an integer, and so if $(q_1 - q_2)m > 0$ then $(q_1 - q_2)m \geq m$. Hence we
conclude that $r_1 = r_2$. But then $q_1 m = q_2 m$, and thus $q_1 = q_2$. This proves
the uniqueness, and concludes the proof of our theorem.

Remark. The result of Theorem 2.1 is nothing but an expression of 
the result of long division. We call r the remainder of the division of n 
by m. 
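
The existence part of Theorem 2.1 can be mirrored in a short Python sketch. It relies on the fact that Python's integer floor division `n // m` rounds toward negative infinity, so for m > 0 it yields the largest integer q with qm ≤ n. The function name `divide_with_remainder` is hypothetical and chosen only for this illustration.

```python
def divide_with_remainder(n, m):
    """Return (q, r) with n == q*m + r and 0 <= r < m, for m > 0."""
    if m <= 0:
        raise ValueError("Theorem 2.1 assumes m > 0")
    q = n // m        # largest integer q with q*m <= n (floor division)
    r = n - q * m     # the remainder, as in the proof
    return q, r


# Negative n is allowed: the remainder is still in the range 0 <= r < m.
for n in (17, -17, 0, 5):
    q, r = divide_with_remainder(n, 5)
    print(n, "=", q, "* 5 +", r)
```

Note that the remainder stays non-negative even when n is negative, which matches the normalization $0 \leq r < m$ in the theorem.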


I, §2. EXERCISES 

1. If n, m are integers $\geq 1$ and $n \geq m$, define the binomial coefficients

\[ \binom{n}{m} = \frac{n!}{m!\,(n - m)!}. \]

As usual, $n! = n \cdot (n - 1) \cdots 1$ is the product of the first n integers. We define
0! = 1 and $\binom{n}{0} = 1$. Prove that

\[ \binom{n}{m - 1} + \binom{n}{m} = \binom{n + 1}{m}. \]

2. Prove by induction that for any integers x, y we have

\[ (x + y)^n = \sum_{i=0}^{n} \binom{n}{i} x^i y^{n-i} = y^n + \binom{n}{1} x y^{n-1} + \binom{n}{2} x^2 y^{n-2} + \cdots + x^n. \]

3. Prove the following statements for all positive integers:

(a) $1 + 3 + 5 + \cdots + (2n - 1) = n^2$

(b) $1^2 + 2^2 + \cdots + n^2 = n(n + 1)(2n + 1)/6$

(c) $1^3 + 2^3 + 3^3 + \cdots + n^3 = [n(n + 1)/2]^2$

4. Prove that

[Displayed identity lost in extraction; only the denominator $(n - 1)!$ of its right-hand side survives.]


5. Let x be a real number. Prove that there exists an integer q and a real number s
with $0 \leq s < 1$ such that x = q + s, and that q, s are uniquely determined. Can you
deduce the Euclidean algorithm from this result without using induction?


I, §3. GREATEST COMMON DIVISOR 

Let n be a non-zero integer, and d a non-zero integer. We shall say that
d divides n if there exists an integer q such that n = dq. We then write
$d \mid n$. If m, n are non-zero integers, by a common divisor of m and n we
mean an integer $d \neq 0$ such that $d \mid n$ and $d \mid m$. By a greatest common
divisor or g.c.d. of m and n, we mean an integer d > 0 which is a common
divisor, and such that, if e is a divisor of m and n then e divides d. We
shall see in a moment that a greatest common divisor always exists. It is
immediately verified that a greatest common divisor is uniquely
determined. We define the g.c.d. of several integers in a similar way.

Let J be a subset of the integers. We shall say that J is an ideal if it
has the following properties:

The integer 0 is in J. If m, n are in J, then m + n is in J. If m is in
J, and n is an arbitrary integer, then nm is in J.

Example. Let $m_1, \ldots, m_r$ be integers. Let J be the set of all integers
which can be written in the form

\[ x_1 m_1 + \cdots + x_r m_r \]

with integers $x_1, \ldots, x_r$. Then it is immediately verified that J is an ideal.
Indeed, if $y_1, \ldots, y_r$ are integers, then

\[ (x_1 m_1 + \cdots + x_r m_r) + (y_1 m_1 + \cdots + y_r m_r) = (x_1 + y_1) m_1 + \cdots + (x_r + y_r) m_r \]

lies in J. If n is an integer, then

\[ n(x_1 m_1 + \cdots + x_r m_r) = n x_1 m_1 + \cdots + n x_r m_r \]

lies in J. Finally, $0 = 0 m_1 + \cdots + 0 m_r$ lies in J, so J is an ideal. We say
that J is generated by $m_1, \ldots, m_r$ and that $m_1, \ldots, m_r$ are generators.

We note that {0} itself is an ideal, called the zero ideal. Also, Z is an 
ideal, called the unit ideal. 

Theorem 3.1. Let J be an ideal of Z. Then there exists an integer d
which is a generator of J. If $J \neq \{0\}$ then one can take d to be the
smallest positive integer in J.

Proof. If J is the zero ideal, then 0 is a generator. Suppose $J \neq \{0\}$.
If $n \in J$ then $-n = (-1)n$ is also in J, so J contains some positive
integer. Let d be the smallest positive integer in J. We contend that d is a
generator of J. To see this, let $n \in J$, and write n = dq + r with $0 \leq r < d$.
Then $r = n - dq$ is in J, and since r < d, it follows that r = 0. This
proves that n = dq, and hence that d is a generator, as was to be shown.

Theorem 3.2. Let $m_1$, $m_2$ be positive integers. Let d be a positive
generator for the ideal generated by $m_1$, $m_2$. Then d is a greatest common
divisor of $m_1$ and $m_2$.


Proof. Since $m_1$ lies in the ideal generated by $m_1$, $m_2$ (because
$m_1 = 1 m_1 + 0 m_2$), there exists an integer $q_1$ such that

\[ m_1 = q_1 d, \]

whence d divides $m_1$. Similarly, d divides $m_2$. Let e be a non-zero
integer dividing both $m_1$ and $m_2$, say

\[ m_1 = h_1 e \quad \text{and} \quad m_2 = h_2 e \]

with integers $h_1$, $h_2$. Since d is in the ideal generated by $m_1$ and $m_2$,
there are integers $s_1$, $s_2$ such that $d = s_1 m_1 + s_2 m_2$, whence

\[ d = s_1 h_1 e + s_2 h_2 e = (s_1 h_1 + s_2 h_2) e. \]

Consequently, e divides d, and our theorem is proved.

Remark. Exactly the same proof applies when we have more than
two integers. For instance, if $m_1, \ldots, m_r$ are non-zero integers, and d is a
positive generator for the ideal generated by $m_1, \ldots, m_r$, then d is a
greatest common divisor of $m_1, \ldots, m_r$.

Integers $m_1, \ldots, m_r$ whose greatest common divisor is 1 are said to be
relatively prime. If that is the case, then there exist integers $x_1, \ldots, x_r$
such that

\[ x_1 m_1 + \cdots + x_r m_r = 1, \]

because 1 lies in the ideal generated by $m_1, \ldots, m_r$.
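
The text proves through ideals that integers $s_1$, $s_2$ with $d = s_1 m_1 + s_2 m_2$ exist, but does not say how to find them. One common way, sketched below as an assumption of this example rather than a method taken from the text, is to apply the division of Theorem 2.1 repeatedly while tracking how each remainder is written as a combination of $m_1$ and $m_2$. The function name `extended_gcd` is illustrative.

```python
def extended_gcd(m1, m2):
    """For positive integers m1, m2 return (d, s1, s2) with
    d = s1*m1 + s2*m2, where d is the positive g.c.d. of m1 and m2."""
    if m1 <= 0 or m2 <= 0:
        raise ValueError("this sketch assumes positive integers")
    # Invariants: old_r == old_s*m1 + old_t*m2 and r == s*m1 + t*m2.
    old_r, r = m1, m2
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r                      # division with remainder
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


d, s1, s2 = extended_gcd(240, 46)
print(d, s1, s2, s1 * 240 + s2 * 46)
```

Every remainder produced along the way lies in the ideal generated by $m_1$ and $m_2$, which is why the last non-zero remainder is a generator in the sense of Theorem 3.1.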


I, §4. UNIQUE FACTORIZATION 

We define a prime number p to be an integer $\geq 2$ such that, given a
factorization p = mn with positive integers m, n, then m = 1 or n = 1. The
first few primes are 2, 3, 5, 7, 11, ....

Theorem 4.1. Every positive integer $n \geq 2$ can be expressed as a
product of prime numbers (not necessarily distinct),

\[ n = p_1 \cdots p_r, \]

uniquely determined up to the order of the factors.

Proof. Suppose that there is at least one integer $\geq 2$ which cannot be
expressed as a product of prime numbers. Let m be the smallest such
integer. Then in particular m is not prime, and we can write m = de with
integers d, e > 1. But then d and e are smaller than m, and since m was
chosen smallest, we can write

\[ d = p_1 \cdots p_r \quad \text{and} \quad e = p'_1 \cdots p'_s \]

with prime numbers $p_1, \ldots, p_r$, $p'_1, \ldots, p'_s$. Thus

\[ m = de = p_1 \cdots p_r \, p'_1 \cdots p'_s \]

is expressed as a product of prime numbers, a contradiction, which
proves that every positive integer $\geq 2$ can be expressed as a product of
prime numbers.

We must now prove the uniqueness, and for this we need a lemma. 

Lemma 4.2. Let p be a prime number, and m, n non-zero integers such
that p divides mn. Then $p \mid m$ or $p \mid n$.

Proof. Assume that p does not divide m. Then the greatest common 
divisor of p and m is 1, and there exist integers a , b such that 


1 = ap + bm. 


(We use Theorem 3.2.) Multiplying by n yields 

n = nap + bmn. 

But mn = pc for some integer c, whence 

n = (na + bc)p,

and p divides n , as was to be shown. 

This lemma will be applied when p divides a product of prime
numbers $q_1 \cdots q_s$. In that case, p divides $q_1$ or p divides $q_2 \cdots q_s$. If p
divides $q_1$, then $p = q_1$. Otherwise, we can proceed inductively, and we
conclude that in any case, there exists some i such that $p = q_i$.

Suppose now that we have two products of primes

\[ p_1 \cdots p_r = q_1 \cdots q_s. \]

By what we have just seen, we may renumber the primes $q_1, \ldots, q_s$ and
then we may assume that $p_1 = q_1$. Cancelling $q_1$, we obtain

\[ p_2 \cdots p_r = q_2 \cdots q_s. \]

We may then proceed by induction to conclude that after a renumbering
of the primes $q_1, \ldots, q_s$ we have r = s, and $p_i = q_i$ for all i. This proves
the desired uniqueness.

In expressing an integer as a product of prime numbers, it is
convenient to bunch together all equal factors. Thus let n be an integer > 1,
and let $p_1, \ldots, p_r$ be the distinct prime numbers dividing n. Then there
exist unique integers $m_1, \ldots, m_r > 0$ such that $n = p_1^{m_1} \cdots p_r^{m_r}$. We agree to
the usual convention that for any non-zero integer x, $x^0 = 1$. Then given
any positive integer n, we can write n as a product of prime powers with
distinct primes $p_1, \ldots, p_r$:

\[ n = p_1^{m_1} \cdots p_r^{m_r}, \]

where the exponents $m_1, \ldots, m_r$ are integers $\geq 0$, and uniquely determined.
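
For small n the exponents $m_1, \ldots, m_r$ can be computed by trial division, as in the minimal Python sketch below. Trial division is an assumption of this example and not a method described in the text, and it becomes slow for large n. The function name `prime_factorization` and the dictionary representation (prime mapped to exponent) are illustrative choices.

```python
def prime_factorization(n):
    """Return a dict {p: m} with n equal to the product of p**m over the
    distinct primes p dividing n. The integer 1 gives the empty dict."""
    if n < 1:
        raise ValueError("this sketch assumes a positive integer")
    factors = {}
    p = 2
    while p * p <= n:
        # Divide out p as often as possible. Any p reached here that
        # divides n is prime, because smaller factors are already gone.
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        # What remains has no divisor up to its square root, so it is prime.
        factors[n] = factors.get(n, 0) + 1
    return factors


print(prime_factorization(360))
```

Theorem 4.1 is what guarantees that the resulting dictionary does not depend on the order in which the divisions are carried out.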

The set of quotients of integers m/n with $n \neq 0$ is called the rational
numbers, and denoted by Q. We assume for the moment that the reader
is familiar with Q. We show later how to construct Q from Z and how
to prove its properties.

Let a = m/n be a rational number, $n \neq 0$ and assume $a \neq 0$, so $m \neq 0$.
Let d be the greatest common divisor of m and n. Then we can write
m = dm' and n = dn', and m', n' are relatively prime. Thus

\[ a = \frac{m'}{n'}. \]

If we now express $m' = p_1^{i_1} \cdots p_r^{i_r}$ and $n' = q_1^{j_1} \cdots q_s^{j_s}$ as products of prime
powers, we obtain a factorization of a itself, and we note that no $p_k$ is
equal to any $q_l$.

If a rational number is expressed in the form m/n where m, n are
integers, $n \neq 0$, and m, n are relatively prime, then we call n the denominator
of the rational number, and m its numerator. Occasionally, by abuse of
language, when one writes a quotient m/n where m, n are not necessarily
relatively prime, one calls n a denominator for the fraction.


I, §4. EXERCISES 

1. Prove that there are infinitely many prime numbers. [Hint from Euclid: Let 2,
3, 5, ..., P be the set of primes up to P. Show that there is another prime as
follows. Let

\[ N = 2 \cdot 3 \cdot 5 \cdot 7 \cdots P + 1, \]

the product being taken over all primes $\leq P$. Show that any prime dividing
N is not among the primes up to P.]

2. Define a twin prime to be a prime p such that p + 2 is also prime. For
instance (3, 5), (5, 7), (11, 13) are twin primes.

(a) Write down all the twin primes less than 100.

(b) Are there infinitely many twin primes? Use a computer to compute more
twin primes and see if there is any regularity in their occurrence.

3. Observe that $5 = 2^2 + 1$, $17 = 4^2 + 1$, $37 = 6^2 + 1$ are primes. Are there
infinitely many primes of the form $n^2 + 1$ where n is a positive integer? Compute
all the primes less than 100 which are of the form $n^2 + 1$. Use a computer to
compute further primes and see if there is any regularity of occurrence for
these primes.

4. Start with a positive odd integer n. Then 3n + 1 is even. Divide by the
largest power of 2 which is a factor of 3n + 1. You obtain an odd integer $n_1$.
Iterate these operations. In other words, form $3n_1 + 1$, divide by the maximal
power of 2 which is a factor of $3n_1 + 1$, and iterate again. What do you think
happens? Try it out, starting with n = 1, n = 3, n = 5, and go up to n = 41.
You will find that at some point, for each of these values of n, the iteration
process comes back to 1. There is a conjecture which states that the above
iteration procedure will always yield 1, no matter what odd integer n you
started with. For an expository article on this problem, see J. C. Lagarias,
"The 3x + 1 problem and its generalizations", American Mathematical
Monthly, Vol. 92, No. 1, 1985. The problem is traditionally credited to Lothar
Collatz, dating back to the 1930's. The problem has a reputation for getting
people to think unsuccessfully about it, to the point where someone once made
the joke that "this problem was part of a conspiracy to slow down
mathematical research in the U.S.". Lagarias gives an extensive bibliography
of papers dealing with the problem and some of its offshoots.
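
The experiment suggested in Exercise 4 might be carried out along the following lines. This is only a sketch: the function names `next_odd` and `odd_orbit` and the safety cap `max_steps` are hypothetical choices for the illustration, and a computation of this kind says nothing about the conjecture for integers that were not tried.

```python
def next_odd(n):
    """One step of the procedure: form 3n + 1, then divide by the
    largest power of 2 that is a factor. Requires n odd and positive."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    m = 3 * n + 1
    while m % 2 == 0:
        m //= 2
    return m


def odd_orbit(n, max_steps=10000):
    """List n, n_1, n_2, ... until 1 is reached or max_steps is exceeded."""
    orbit = [n]
    while n != 1 and len(orbit) <= max_steps:
        n = next_odd(n)
        orbit.append(n)
    return orbit


# Try every odd starting value from 1 up to 41, as the exercise suggests.
for start in range(1, 42, 2):
    print(start, odd_orbit(start))
```

The cap `max_steps` is there only so that the loop is guaranteed to stop; for the starting values in the exercise it is never reached.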

Prime numbers constitute one of the oldest and deepest areas of research in
mathematics. Fortunately, it is possible to state the greatest problem of
mathematics in simple terms, and we shall now do so. The problem is part of the
more general framework to describe how the primes are distributed among the
integers. There are many refinements to this question. We start by asking
approximately how many primes are there $\leq x$ when x becomes arbitrarily large?
We want first an asymptotic formula. We recall briefly a couple of definitions
from the basic terminology of functions. Let f, g be two functions of a real
variable and assume g positive. We say that $f(x) = O(g(x))$ for $x \to \infty$ if there is a
constant C > 0 such that $|f(x)| \leq C g(x)$ for all x sufficiently large. We say that
f(x) is asymptotic to g(x) and we write $f \sim g$ if

\[ \lim_{x \to \infty} \frac{f(x)}{g(x)} = 1. \]

Let $\pi(x)$ denote the number of primes $p \leq x$. At the end of the 19th century
Hadamard and de la Vallée-Poussin proved the prime number theorem, that

\[ \pi(x) \sim \frac{x}{\log x}. \]

Thus x/log x gives a first-order approximation to count the primes. But although
the formula x/log x has the attractiveness of being a closed formula, and of being
very simple, it does not give a very good approximation, and conjecturally, there
is a much better one, as follows.

Roughly speaking, the idea is that the probability for a positive integer n to be
prime is 1/log n. What does this mean? It means that $\pi(x)$ should be given with
very good approximation by the sum

\[ L(x) = \frac{1}{\log 2} + \frac{1}{\log 3} + \cdots + \frac{1}{\log n} = \sum_{k=2}^{n} \frac{1}{\log k}, \]

where n is the largest integer $\leq x$. If x is taken to be an integer, then we take
n = x. For those who have had calculus, you will see immediately that the above
sum is a Riemann sum for the integral usually denoted by Li(x), namely

\[ \mathrm{Li}(x) = \int_2^x \frac{dt}{\log t}, \]

and that the sum differs from the integral by a small error, bounded
independently of x; in other words, $L(x) = \mathrm{Li}(x) + O(1)$.

The question is: How good is the approximation of $\pi(x)$ by the sum L(x), or
for that matter by the integral Li(x)? That's where the big problem comes from.
The following conjecture was made by Riemann in 1859.

Riemann Hypothesis. We have

\[ \pi(x) = L(x) + O(x^{1/2} \log x). \]

This means that the sum L(x) gives an approximation to $\pi(x)$ with an error term
which has the order of magnitude $x^{1/2} \log x$, so roughly the square root of x,
which is very small compared to x when x is large. You can verify this
relationship experimentally by making up tables for $\pi(x)$ and for L(x). You will
find that the difference is quite small.
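
A table of the kind suggested above might be produced with the following Python sketch. The sieve of Eratosthenes used to count primes is an assumption of this example (the text does not say how to compute $\pi(x)$), the function names are illustrative, and `math.log` is the natural logarithm. Such a table can illustrate the size of the differences but of course proves nothing about the conjecture.

```python
import math


def primes_up_to(x):
    """Sieve of Eratosthenes: list of all primes p with p <= x."""
    if x < 2:
        return []
    is_prime = [True] * (x + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, math.isqrt(x) + 1):
        if is_prime[p]:
            for multiple in range(p * p, x + 1, p):
                is_prime[multiple] = False
    return [k for k in range(2, x + 1) if is_prime[k]]


def prime_count(x):
    """pi(x): the number of primes p <= x, for an integer x."""
    return len(primes_up_to(x))


def L(x):
    """The sum 1/log 2 + 1/log 3 + ... + 1/log x, for an integer x >= 2."""
    return sum(1.0 / math.log(k) for k in range(2, x + 1))


print("x, pi(x), L(x), x/log x")
for x in (100, 1000, 10000, 100000):
    print(x, prime_count(x), round(L(x), 1), round(x / math.log(x), 1))
```

Comparing the last two columns with the second gives a feeling for how much closer L(x) stays to $\pi(x)$ than x/log x does in this range.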

Even knowing the Riemann hypothesis, lots of other questions would still
arise. For instance, consider twin primes

(3, 5), (5, 7), (11, 13), (17, 19), ....

These are primes p such that p + 2 is also prime. Let $\pi_t(x)$ denote the number of
twin primes $\leq x$. It is not known today whether there are infinitely many twin
primes, but conjecturally it is possible to give an asymptotic estimate for their
number. Hardy and Littlewood conjectured that there is a constant $C_t > 0$ such
that

\[ \pi_t(x) \sim C_t \frac{x}{(\log x)^2}, \]

and they determined this constant explicitly.

Finally, let $\pi_s(x)$ denote the number of primes $\leq x$ which are of the form
$n^2 + 1$. It is not known whether there are infinitely many such primes, but
Hardy-Littlewood have conjectured that there is a constant $C_s > 0$ such that

\[ \pi_s(x) \sim C_s \frac{x^{1/2}}{\log x}, \]

and they have determined this constant explicitly. The determination of such
constants as $C_t$ or $C_s$ is not so easy and depends on subtle relations of
dependence between primes. For an informal discussion of these problems with a
general audience, and some references to original papers, cf. my books: The Beauty of
Doing Mathematics, and Math Talks for Undergraduates (the talk on prime numbers),
Springer-Verlag.


I, §5. EQUIVALENCE RELATIONS AND CONGRUENCES 

Let S be a set. By an equivalence relation in S we mean a relation, writ- 
ten x ~ y, between certain pairs of elements of S, satisfying the following 
conditions: 

ER 1. We have x ~ x for all $x \in S$.

ER 2. If x ~ y and y ~ z then x ~ z. 

ER 3. If x ~ y then y ~ x. 


Suppose we have such an equivalence relation in S. Given an element
x of S, let $C_x$ consist of all elements of S which are equivalent to
x. Then all elements of $C_x$ are equivalent to one another, as follows
at once from our three properties. (Verify this in detail.) Furthermore,
you will also verify at once that if x, y are elements of S, then either
$C_x = C_y$, or $C_x$, $C_y$ have no element in common. Each $C_x$ is called an
equivalence class. We see that our equivalence relation determines a
decomposition of S into disjoint equivalence classes. Each element of
a class is called a representative of the class.
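
For a finite set the decomposition into classes can be computed directly. The Python sketch below is an illustration under stated assumptions: the function name `equivalence_classes` is hypothetical, the argument `related` is assumed to satisfy ER 1 through ER 3 (the code does not check this), and the sample relation "x - y is divisible by 3" is chosen only as an example.

```python
def equivalence_classes(elements, related):
    """Split `elements` into the classes of the relation `related`.
    Assumes `related(x, y)` is an equivalence relation, so it is enough
    to compare each x with one representative of each class."""
    classes = []
    for x in elements:
        for cls in classes:
            if related(x, cls[0]):    # cls[0] serves as a representative
                cls.append(x)
                break
        else:
            classes.append([x])       # x starts a new class
    return classes


def differ_by_multiple_of_3(x, y):
    return (x - y) % 3 == 0


print(equivalence_classes(range(-4, 5), differ_by_multiple_of_3))
```

Comparing with a single representative is justified exactly by the fact noted above, that all elements of a class are equivalent to one another.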

Our first example of the notion of equivalence relation will be the
notion of congruence. Let n be a positive integer. Let x, y be integers.
We shall say that x is congruent to y modulo n if there exists an integer m
such that $x - y = mn$. This means that $x - y$ lies in the ideal generated
by n. If $n \neq 0$, this also means that $x - y$ is divisible by n. We write the
relation of congruence in the form

\[ x \equiv y \pmod{n}. \]

It is then immediately verified that this is an equivalence relation, namely
that the following properties are verified:

(a) We have $x \equiv x \pmod{n}$.

(b) If $x \equiv y$ and $y \equiv z \pmod{n}$, then $x \equiv z \pmod{n}$.

(c) If $x \equiv y \pmod{n}$ then $y \equiv x \pmod{n}$.

Congruences also satisfy further properties:

(d) If $x \equiv y \pmod{n}$ and z is an integer, then $xz \equiv yz \pmod{n}$.

(e) If $x \equiv y$ and $x' \equiv y' \pmod{n}$, then $xx' \equiv yy' \pmod{n}$. Furthermore
$x + x' \equiv y + y' \pmod{n}$.

We give the proof of the first part of (e) as an example. We can write

\[ x = y + mn \quad \text{and} \quad x' = y' + m'n \]

with some integers m, m'. Then

\[ xx' = (y + mn)(y' + m'n) = yy' + mny' + ym'n + mm'nn, \]

and the expression on the right is immediately seen to be equal to

\[ yy' + wn \]

for some integer w, so that $xx' \equiv yy' \pmod{n}$, as desired.
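
Properties (d) and (e) can be spot-checked on a finite window of integers, as in the Python sketch below. The names `congruent` and `check_properties` and the window size `bound` are hypothetical choices for this illustration. A finite check is only a sanity test, and the proof above is what covers all integers.

```python
def congruent(x, y, n):
    """True when x - y is a multiple of n (n a positive integer)."""
    return (x - y) % n == 0


def check_properties(n, bound=8):
    """Test (d) and (e) for all x, y, z, x', y' between -bound and bound."""
    values = range(-bound, bound + 1)
    for x in values:
        for y in values:
            if not congruent(x, y, n):
                continue
            # (d): multiplying both sides by any integer z
            for z in values:
                if not congruent(x * z, y * z, n):
                    return False
            # (e): products and sums of two congruences
            for x2 in values:
                for y2 in values:
                    if congruent(x2, y2, n):
                        if not congruent(x * x2, y * y2, n):
                            return False
                        if not congruent(x + x2, y + y2, n):
                            return False
    return True


print(all(check_properties(n) for n in (1, 2, 5, 6)))
```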

We define the even integers to be those which are congruent to 
0 mod 2. Thus n is even if and only if there exists an integer m such that 
n = 2m. We define the odd integers to be all the integers which are not 
even. It is trivially shown that an odd integer n can be written in the 
form 2m + 1 for some integer m. 


I, §5. EXERCISES 

1. Let n, d be positive integers and assume 1 < d < n. Show that n can be
written in the form

\[ n = c_0 + c_1 d + \cdots + c_k d^k \]

with integers $c_i$ such that $0 \leq c_i < d$, and that these integers $c_i$ are uniquely
determined. [Hint: For the existence, write $n = qd + c_0$ by the Euclidean
algorithm, and then use induction. For the uniqueness, use induction,
assuming $c_0, \ldots, c_r$ are uniquely determined; show that $c_{r+1}$ is then uniquely
determined.]

2. Let m, n be non-zero integers written in the form

\[ m = p_1^{i_1} \cdots p_r^{i_r} \quad \text{and} \quad n = p_1^{j_1} \cdots p_r^{j_r}, \]

where $i_\nu$, $j_\nu$ are integers $\geq 0$ and $p_1, \ldots, p_r$ are distinct prime numbers.

(a) Show that the g.c.d. of m, n can be expressed as a product $p_1^{k_1} \cdots p_r^{k_r}$ where
$k_1, \ldots, k_r$ are integers $\geq 0$. Express $k_\nu$ in terms of $i_\nu$ and $j_\nu$.

(b) Define the notion of least common multiple, and express the least common
multiple of m, n as a product $p_1^{k_1} \cdots p_r^{k_r}$ with integers $k_\nu \geq 0$. Express $k_\nu$ in
terms of $i_\nu$ and $j_\nu$.

3. Give the g.c.d. and l.c.m. of the following pairs of positive integers: 

(a) $5^3 2^6 7^3$ and 225

(b) 248 and 28. 

4. Let n be an integer $\geq 2$.

(a) Show that any integer x is congruent mod n to a unique integer m such
that $0 \leq m < n$.

(b) Show that any integer $x \neq 0$, relatively prime to n, is congruent to a
unique integer m relatively prime to n, such that 0 < m < n.

(c) Let $\varphi(n)$ be the number of integers m relatively prime to n, such that
0 < m < n. We call $\varphi$ the Euler phi function. We also define $\varphi(1) = 1$. If
n = p is a prime number, what is $\varphi(p)$?

(d) Determine $\varphi(n)$ for each integer n with $1 \leq n \leq 10$.

5. Chinese Remainder Theorem. Let n, n' be relatively prime positive integers.
Let a, b be integers. Show that the congruences

\[ x \equiv a \pmod{n}, \qquad x \equiv b \pmod{n'} \]

can be solved simultaneously with some $x \in Z$. Generalize to several
congruences $x \equiv a_i \pmod{n_i}$, where $n_1, \ldots, n_r$ are pairwise relatively prime positive
integers.

6. Let a, b be non-zero relatively prime integers. Show that 1/ab can be written
in the form

\[ \frac{1}{ab} = \frac{x}{a} + \frac{y}{b} \]

with some integers x, y.

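7. Show that any rational number $a \ne 0$ can be written in the form 

$$a = \frac{x_1}{p_1^{r_1}} + \cdots + \frac{x_n}{p_n^{r_n}},$$

where $x_1, \ldots, x_n$ are integers, $p_1, \ldots, p_n$ are distinct prime numbers, and 
$r_1, \ldots, r_n$ are integers $\ge 0$. 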
8. Let $p$ be a prime number and $n$ an integer, $1 \le n \le p - 1$. Show that the 
binomial coefficient $\binom{p}{n}$ is divisible by $p$. 

9. For all integers $x$, $y$ and all primes $p$ show that $(x + y)^p \equiv x^p + y^p \pmod{p}$. 

10. Let $n$ be an integer $\ge 2$. Show by examples that the binomial coefficient 


11. (a) Prove that a positive integer is divisible by 3 if and only if the sum of its 

digits is divisible by 3. 

(b) Prove that it is divisible by 9 if and only if the sum of its digits is 
divisible by 9. 

(c) Prove that it is divisible by 11 if and only if the alternating sum of its 
digits is divisible by 11. In other words, let the integer be 

$$n = a_k a_{k-1} \cdots a_0 = a_0 + a_1 10 + a_2 10^2 + \cdots + a_k 10^k, \qquad 0 \le a_i \le 9.$$

Then $n$ is divisible by 11 if and only if $a_0 - a_1 + a_2 - a_3 + \cdots + (-1)^k a_k$ 
is divisible by 11. 

12. A positive integer is called palindromic if its digits from left to right are the 
same as the digits from right to left. For instance, 242 and 15851 are palindromic. 
The integers 11, 101, 373, 10301 are palindromic primes. Observe 
that except for 11, the others have an odd number of digits. 

(a) Is there a palindromic prime with four digits? With an even number of 
digits (except for 11)? 

(b) Are there infinitely many palindromic primes? (This is an unsolved problem 
in mathematics.) 




CHAPTER II 


Groups 


II, §1. GROUPS AND EXAMPLES 

A group G is a set, together with a rule (called a law of composition) 
which to each pair of elements x, y in G associates an element denoted 
by xy in G, having the following properties. 

GR 1. For all x, y , z in G we have associativity , namely 

(xy)z = x(yz). 

GR 2. There exists an element e of G such that ex = xe = x for all x 
in G. 

GR 3. If x is an element of G, then there exists an element y of G 
such that xy = yx = e. 

Strictly speaking, we call G a multiplicative group. If we denote the 
element of G associated with the pair (x, y) by x + y, then we write 
GR 1 in the form 


(x + y) + z = x + (y + z), 


GR 2 in the form that there exists an element 0 such that 


0+x=x+0=x 
for all x in G, and GR 3 in the form that given $x \in G$, there exists an 
element y of G such that 


x + y = y + x = 0. 

With this notation, we call G an additive group and x + y the sum. We shall 
use the + notation only when the group satisfies the additional rule 

x + y = y + x 

for all x, y in G. With the multiplicative notation, this is written xy = yx 
for all x, y in G, and if G has this property, we call G a commutative, or 
abelian group. 

We shall now prove various simple statements which hold for all 
groups. 

Let G be a group. The element e of G whose existence is asserted by 
GR 2 is uniquely determined. 

Proof. If e, e' both satisfy this condition, then 

e' = ee' = e. 

We call this element the unit element of G. We call it the zero element 
in the additive case. 

Let $x \in G$. The element y such that yx = xy = e is uniquely determined. 

Proof. If z satisfies zx = xz = e, then 


z = ez = (yx)z = y(xz) = ye = y. 

We call y the inverse of x, and denote it by $x^{-1}$. In the additive notation, 
we write $y = -x$. 

We shall now give examples of groups. Many of these involve notions 
which the reader will no doubt have encountered already in other 
courses. 

Example 1. Let Q denote the rational numbers, i.e. the set of all fractions 
m/n where m, n are integers, and $n \ne 0$. Then Q is a group under 
addition. Furthermore, the non-zero elements of Q form a group under 
multiplication, denoted by Q*. 

Example 2. The real numbers and complex numbers are groups under 
addition. The non-zero real numbers and non-zero complex numbers are 
groups under multiplication. We shall always denote the real and complex 
numbers by R and C respectively, and the group of non-zero elements 
by R* and C* respectively. 

Example 3. The complex numbers of absolute value 1 form a group 
under multiplication. 

Example 4. The set consisting of the numbers $1, -1$ is a group under 
multiplication, and this group has 2 elements. 

Example 5. The set consisting of the numbers $1, -1, i, -i$ is a group 
under multiplication. This group has 4 elements. 

Example 6 (The direct product). Let G, G' be groups. Let $G \times G'$ be 
the set consisting of all pairs (x, x') with $x \in G$ and $x' \in G'$. If (x, x') and 
(y, y') are such pairs, define their product to be (xy, x'y'). Then $G \times G'$ is 
a group. 

It is a simple matter to verify that all the conditions GR 1, 2, 3 are 
satisfied, and we leave this to the reader. We call $G \times G'$ the direct 
product of G and G'. 

One may also take a direct product of a finite number of groups. 
Thus if $G_1, \ldots, G_n$ are groups, we let 

$$\prod_{i=1}^{n} G_i = G_1 \times \cdots \times G_n$$

be the set of all n-tuples $(x_1, \ldots, x_n)$ with $x_i \in G_i$. We define multiplication 
componentwise, and see at once that $G_1 \times \cdots \times G_n$ is a group. If $e_i$ is 
the unit element of $G_i$, then $(e_1, \ldots, e_n)$ is the unit element of the product. 
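
For finite groups, the componentwise law can be sketched in Python. This is only an illustration under stated assumptions: the helper name `direct_product` is hypothetical, and both factors are taken to be the two-element group of Example 4.

```python
from itertools import product


def direct_product(elems_g, op_g, elems_h, op_h):
    """Return the set of pairs (x, x') together with the componentwise law."""
    pairs = list(product(elems_g, elems_h))

    def op(p, q):
        # (x, x')(y, y') = (xy, x'y')
        return (op_g(p[0], q[0]), op_h(p[1], q[1]))

    return pairs, op


def mul(x, y):
    return x * y


# Assumed example: both factors are the group {1, -1} under multiplication.
pairs, op = direct_product([1, -1], mul, [1, -1], mul)
unit = (1, 1)  # the pair of unit elements

# The pair of units behaves as the unit element of the product.
unit_ok = all(op(unit, p) == p == op(p, unit) for p in pairs)
```

The product has one pair for each choice of a component in each factor, so the lists above give four pairs.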

Example 7. The Euclidean space $\mathbf{R}^n$ is nothing but the product 

$$\mathbf{R}^n = \mathbf{R} \times \cdots \times \mathbf{R}$$

taken n times. In this case, we view $\mathbf{R}$ as an additive group. 

A group consisting of one element is said to be trivial. A group in 
general may have infinitely many elements, or only a finite number. If G 
has only a finite number of elements, then G is called a finite group, and 
the number of elements of G is called its order. The group of Example 4 
has order 2, and that of Example 5 has order 4. 
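
For a finite set, the conditions GR 1, 2, 3 can be checked by brute force. The sketch below does this for the group of Example 5. The representation of $a + bi$ as a pair of integers `(a, b)` and the names `gmul` and `is_group` are assumptions made for the illustration, not notation from the text.

```python
from itertools import product


def gmul(z, w):
    """Multiply a + bi and c + di, each given as a pair of integers."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)


def is_group(elements, op):
    """Check closure, GR 1, GR 2 and GR 3 on a finite set by exhaustion."""
    elems = set(elements)
    # The law of composition must land in the set again.
    if not all(op(x, y) in elems for x, y in product(elems, repeat=2)):
        return False
    # GR 1: associativity.
    if not all(op(op(x, y), z) == op(x, op(y, z))
               for x, y, z in product(elems, repeat=3)):
        return False
    # GR 2: a unit element.
    units = [e for e in elems if all(op(e, x) == x == op(x, e) for x in elems)]
    if not units:
        return False
    e = units[0]
    # GR 3: every element has an inverse.
    return all(any(op(x, y) == e == op(y, x) for y in elems) for x in elems)


ONE, MINUS_ONE, I, MINUS_I = (1, 0), (-1, 0), (0, 1), (0, -1)
G4 = [ONE, MINUS_ONE, I, MINUS_I]   # Example 5
G2 = [ONE, MINUS_ONE]               # Example 4
```

The subset consisting of 1 and i alone fails the check, since i times i is not in it.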

In Examples 1 through 5, the groups happen to be commutative. We 
shall find non-commutative examples later, when we study groups of per- 
mutations. See also groups of matrices in Chapter VI. 
Let G be a group. Let $x_1, \ldots, x_n$ be elements of G. We can then form 
their product, which we define by induction to be 


$$x_1 \cdots x_n = (x_1 \cdots x_{n-1}) x_n.$$


Using the associative law of GR 1, one can show that one gets the same 
value for this product no matter how parentheses are inserted around its 
elements. For instance for n = 4, 

$$(x_1 x_2)(x_3 x_4) = x_1(x_2(x_3 x_4))$$

and also 

$$(x_1 x_2)(x_3 x_4) = ((x_1 x_2)x_3)x_4.$$


We omit the proof in the general case (done by induction), because it 
involves slight notational complications which we don’t want to go into. 
The above product will also be written 

$$\prod_{i=1}^{n} x_i.$$

If the group is written additively, then we write the sum sign instead 
of the product sign, so that a sum of n terms looks like 

$$\sum_{i=1}^{n} x_i = (x_1 + \cdots + x_{n-1}) + x_n = x_1 + \cdots + x_n.$$


The group G being commutative, and written additively, it can be shown 
by induction that the above sum is independent of the order in which 
$x_1, \ldots, x_n$ are taken. We shall again omit the proof. For example, if 
n = 4, 

$$\begin{aligned}
(x_1 + x_2) + (x_3 + x_4) &= x_1 + (x_2 + x_3 + x_4) \\
&= x_1 + (x_3 + x_2 + x_4) \\
&= x_3 + (x_1 + x_2 + x_4).
\end{aligned}$$


Let G be a group, and H a subset of G. We shall say that H is a 
subgroup if it contains the unit element, and if, whenever $x, y \in H$, then 
xy and $x^{-1}$ are also elements of H. (Additively, we write $x + y \in H$ and 
$-x \in H$.) Then H is itself a group in its own right, the law of composition 
in H being the same as that in G. The unit element of G constitutes 
a subgroup, which is called the trivial subgroup. Every group G is a 
subgroup of itself. 

Example 8. The additive group of rational numbers is a subgroup of 
the additive group of real numbers. The group of complex numbers of 
absolute value 1 is a subgroup of the multiplicative group of non-zero 
complex numbers. The group $\{1, -1\}$ is a subgroup of $\{1, -1, i, -i\}$. 

There is a general way of obtaining subgroups from a group. Let S 
be a subset of a group G, having at least one element. Let H be the set 
of elements of G consisting of all products $x_1 \cdots x_n$ such that $x_i$ or $x_i^{-1}$ is 
an element of S for each i, and also containing the unit element. Then H 
is obviously a subgroup of G, called the subgroup generated by S. We 
also say that S is a set of generators of H. If S is a set of generators for 
H, we shall use the notation 


$$H = \langle S \rangle.$$

Thus if elements $\{x_1, \ldots, x_r\}$ form a set of generators for G, we write 

$$G = \langle x_1, \ldots, x_r \rangle.$$
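
When the subgroup generated by S is finite, it can be computed by repeatedly multiplying by elements of S and their inverses until nothing new appears. The following is a minimal sketch under that finiteness assumption (the loop would not terminate otherwise). The names `generated_subgroup`, `gmul`, `ginv` and the encoding of $a + bi$ as a pair `(a, b)` are hypothetical choices for the example.

```python
def generated_subgroup(gens, op, inv, unit):
    """All products x_1 ... x_n with each x_i or its inverse in gens, plus the unit."""
    steps = list(gens) + [inv(s) for s in gens]
    found = {unit}
    frontier = [unit]
    while frontier:
        h = frontier.pop()
        for s in steps:
            new = op(h, s)
            if new not in found:
                found.add(new)
                frontier.append(new)
    return found


def gmul(z, w):
    """Multiply a + bi and c + di, each given as a pair of integers."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)


def ginv(z):
    """Inverse of a complex number of absolute value 1: its conjugate."""
    a, b = z
    return (a, -b)


ONE, MINUS_ONE, I = (1, 0), (-1, 0), (0, 1)

H1 = generated_subgroup([MINUS_ONE], gmul, ginv, ONE)  # the group {1, -1}
H2 = generated_subgroup([I], gmul, ginv, ONE)          # the group {1, -1, i, -i}
```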

Example 9. The number 1 is a generator for the additive group of 
integers. Indeed, every integer can be written in the form 


$$1 + 1 + \cdots + 1$$
or 

$$-1 - 1 - \cdots - 1,$$

or it is the 0 integer. 

Observe that in additive notation, the condition that S be a set of 
generators for the group is that every element of the group not 0 can be 
written 

$$x_1 + \cdots + x_n,$$

where $x_i \in S$ or $-x_i \in S$. 

Example 10. Let G be a group. Let x be an element of G. If n is a 
positive integer, we define $x^n$ to be 

$$x x \cdots x,$$

the product being taken n times. If n = 0, we define $x^0 = e$. If $n = -m$ 
where m is an integer > 0, we define 

$$x^{-m} = (x^{-1})^m.$$

It is then routinely verified that the rule 


$$x^{m+n} = x^m x^n$$
holds for all integers m, n. The verification is tedious but straightforward. 
For instance, suppose m, n are positive integers. Then 

$$x^m x^n = \underbrace{x \cdots x}_{m \text{ times}} \; \underbrace{x \cdots x}_{n \text{ times}} = \underbrace{x \cdots x}_{m+n \text{ times}}.$$


Suppose again that m, n are positive integers, and m < n. Then (see 
Exercise 3) 

$$x^{-m} x^n = \underbrace{x^{-1} \cdots x^{-1}}_{m \text{ times}} \; \underbrace{x \cdots x}_{n \text{ times}} = x^{n-m}$$


because the product of $x^{-1}$ taken m times will cancel the product of x 
taken m times, and will leave $x^{n-m}$ on the right-hand side. The other 
cases are proved similarly. One could also formalize this by induction, 
but we now omit these tedious steps. 

Similarly, we also have the other rule of exponents, namely 

$$(x^m)^n = x^{mn}.$$

It is also tedious to prove, but it applies to multiplication in groups just 
as it applies to numbers in elementary school, because one uses only the 
law of composition, multiplication, and its associativity, and multiplicative 
inverses for the proof. For instance, if m, n are positive integers, 
then 

$$(x^m)^n = \underbrace{x^m \cdots x^m}_{n \text{ times}} = \underbrace{x \cdots x}_{mn \text{ times}}.$$


If m or n is negative, then one has to go through the definitions to see 
that the rule applies also, and we omit these arguments. 
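
The definition of $x^n$ for all integers n, and the two rules of exponents, can be tried out numerically. The sketch below follows the definition literally (repeated multiplication, and the inverse for negative exponents). The function name `power` and the choice of the group Q* with exact fractions are assumptions for the illustration. A finite check of this kind is of course not a proof.

```python
from fractions import Fraction


def power(x, n, op, inv, unit):
    """x^n as defined in the text: x^0 = e, and x^(-m) = (x^(-1))^m for m > 0."""
    if n < 0:
        x, n = inv(x), -n
    result = unit
    for _ in range(n):
        result = op(result, x)
    return result


def mul(a, b):
    return a * b


def inv(a):
    return 1 / a


x = Fraction(2, 3)          # an element of Q*
exponents = range(-4, 5)

# x^(m+n) = x^m x^n
rule_sum = all(
    power(x, m + n, mul, inv, 1) == mul(power(x, m, mul, inv, 1), power(x, n, mul, inv, 1))
    for m in exponents for n in exponents
)

# (x^m)^n = x^(mn)
rule_product = all(
    power(power(x, m, mul, inv, 1), n, mul, inv, 1) == power(x, m * n, mul, inv, 1)
    for m in exponents for n in exponents
)
```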

If the group is written additively, then we write nx instead of $x^n$, and 
the rules read: 


(m + n)x = mx + nx and (mn)x = m(nx). 


Observe also that we have the rule 


$$(x^n)^{-1} = (x^{-1})^n.$$

To verify this, suppose that n is a positive integer. Then (see Exercise 3) 

$$\underbrace{x \cdots x}_{n \text{ times}} \; \underbrace{x^{-1} \cdots x^{-1}}_{n \text{ times}} = e$$

because we can use the definition $x x^{-1} = e$ repeatedly. If n is negative, 
one uses the definition $x^{-m} = (x^{-1})^m$ with m positive to give the proof. 

Let G be a group and let $a \in G$. Let H be the subset of elements of G 
consisting of all powers $a^n$ with $n \in \mathbf{Z}$. Then H is the subgroup generated 
by a. Indeed, H contains the unit element $e = a^0$. Let $a^n, a^m \in H$. Then 

$$a^m a^n = a^{m+n} \in H.$$

Finally, $(a^n)^{-1} = a^{-n} \in H$. So H satisfies the conditions of a subgroup, 
and H is generated by a. 

Let G be a group. We shall say that G is cyclic if there exists an 
element a of G such that every element x of G can be written in the form 
$a^n$ for some integer n. The subgroup H above is the cyclic subgroup 
generated by a. 

Example 11. Consider the additive group of integers Z. Then Z is 
cyclic, generated by 1. A subgroup of Z is merely what we called an 
ideal in Chapter I. We can now interpret Theorem 3.1 of Chapter I as 
stating: 

Let H be a subgroup of $\mathbf{Z}$. If H is not trivial, let d be the smallest 
positive integer in H. Then H consists of all elements nd, with $n \in \mathbf{Z}$, 
and so H is cyclic. 
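
The statement can be observed on a small example. The sketch below lists a finite window of the subgroup of $\mathbf{Z}$ generated by a few integers (all sums $c_1 g_1 + \cdots + c_r g_r$ with bounded coefficients), and looks for its smallest positive element d. The function names and the bound on the coefficients are assumptions. The comparison with `math.gcd` relies on the standard fact that the generator d is the greatest common divisor of the given integers.

```python
from itertools import product
from math import gcd


def combinations_window(gens, bound=5):
    """Sums c_1*g_1 + ... + c_r*g_r with every coefficient between -bound and bound."""
    coeffs = range(-bound, bound + 1)
    return {
        sum(c * g for c, g in zip(cs, gens))
        for cs in product(coeffs, repeat=len(gens))
    }


def smallest_positive_element(gens, bound=5):
    """Smallest positive element found in the window (a stand-in for d)."""
    return min(v for v in combinations_window(gens, bound) if v > 0)


d = smallest_positive_element([12, 18])
# Every element of the window is a multiple nd of d.
all_multiples = all(v % d == 0 for v in combinations_window([12, 18]))
```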

We now look more closely at cyclic groups. Let G be a cyclic group 
and let a be a generator. Two cases can occur. 


Case 1. There is no positive integer m such that $a^m = e$. Then for 
every integer $n \ne 0$ it follows that $a^n \ne e$. In this case, we say that G is 
infinite cyclic, or that a has infinite order. In fact the elements 

$$a^n \quad\text{with } n \in \mathbf{Z}$$


are all distinct. To see this, suppose that $a^r = a^s$ with some integers r, 
$s \in \mathbf{Z}$. Then $a^{s-r} = e$ so $s - r = 0$ and s = r. For example, the number 2 
generates an infinite cyclic subgroup of the multiplicative group of 
complex numbers. Its elements are 


$$\ldots, 2^{-5}, 2^{-4}, \tfrac{1}{8}, \tfrac{1}{4}, \tfrac{1}{2}, 1, 2, 4, 8, 2^4, 2^5, \ldots.$$


Case 2. There exists a positive integer m such that $a^m = e$. Then we 
say that a has finite order, and we call m an exponent for a. Let J be the 
set of integers $n \in \mathbf{Z}$ such that $a^n = e$. Then J is a subgroup of $\mathbf{Z}$. This 
assertion is routinely verified: we have $0 \in J$ because $a^0 = e$ by definition. 
If $m, n \in J$ then 

$$a^{m+n} = a^m a^n = ee = e,$$

so $m + n \in J$. Also $a^{-m} = (a^m)^{-1} = e$, so $-m \in J$. Thus J is a subgroup of 
$\mathbf{Z}$. By Theorem 3.1 of Chapter I, the smallest positive integer d in J is a 
generator of J. By definition, this positive integer d is the smallest 
positive integer such that $a^d = e$, and d is called the period of a. If $a^n = e$ 
then n = ds for some integer s. 

Suppose a is an element of period d. Let n be an integer. By the 
Euclidean algorithm, we can write 

$$n = qd + r \quad\text{with } q, r \in \mathbf{Z} \text{ and } 0 \le r < d.$$


Then 

$$a^n = (a^d)^q a^r = a^r.$$

Theorem 1.1. Let G be a group and $a \in G$. Suppose that a has finite 
order. Let d be the period of a. Then a generates a cyclic subgroup of 
order d, whose elements are $e, a, \ldots, a^{d-1}$. 

Proof. The remark just before the theorem shows that this cyclic 
subgroup consists of the powers $e, a, \ldots, a^{d-1}$. We must now show that 
the elements 

$$e, a, \ldots, a^{d-1}$$

are distinct. Indeed, suppose $a^r = a^s$ with $0 \le r \le d - 1$ and 

$$0 \le s \le d - 1,$$

say $r \le s$. Then $a^{s-r} = e$. Since 

$$0 \le s - r < d,$$

we must have $s - r = 0$, whence r = s. We conclude that the cyclic group 
generated by a in this case has order d. 
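
The period of an element, and the content of Theorem 1.1, can be computed directly in a finite group. In the sketch below the group is assumed to be the non-zero residues modulo 7 under multiplication. This example group and the name `period` are not taken from the text. The loop only terminates for elements of finite order.

```python
def period(a, op, unit):
    """Smallest positive integer d with a^d = unit (assumes a has finite order)."""
    d, current = 1, a
    while current != unit:
        current = op(current, a)
        d += 1
    return d


def mul7(x, y):
    """Multiplication of residues modulo 7 (assumed example group)."""
    return (x * y) % 7


a = 2
d = period(a, mul7, 1)

# The powers e, a, ..., a^(d-1) of Theorem 1.1.
powers = [pow(a, r, 7) for r in range(d)]

# a^n = a^r whenever n = qd + r with 0 <= r < d.
reduces = all(pow(a, n, 7) == pow(a, n % d, 7) for n in range(40))
```

In this example the cyclic subgroup generated by 2 has order `d`, and the list `powers` has no repetitions.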

Example 12. The multiplicative group $\{1, -1\}$ is cyclic of order 2. 

Example 13. The complex numbers $\{1, i, -1, -i\}$ form a cyclic group 
of order 4. The number i is a generator. 

II, §1. EXERCISES 

1. Let G be a group and a , b , c be elements of G. If ab = ac , show that b = c. 

2. Let G, G' be finite groups, of orders m, n respectively. What is the order of 
$G \times G'$? 

3. Let $x_1, \ldots, x_n$ be elements of a group G. Show (by induction) that 

$$(x_1 \cdots x_n)^{-1} = x_n^{-1} \cdots x_1^{-1}.$$

What does this look like in additive notation? For two elements $x, y \in G$, we 
have $(xy)^{-1} = y^{-1} x^{-1}$. Write this also in additive notation. 

4. (a) Let G be a group and $x \in G$. Suppose that there is an integer $n \ge 1$ such 
that $x^n = e$. Show that there is an integer $m \ge 1$ such that $x^{-1} = x^m$. 

(b) Let G be a finite group. Show that given $x \in G$, there exists an integer 
$n \ge 1$ such that $x^n = e$. 

5. Let G be a finite group and S a set of generators. Show that every element of 
G can be written in the form 

$$x_1 \cdots x_n,$$

where $x_i \in S$. 

6. Let G be a group such that $x^2 = 1$ for all $x \in G$. Prove that G is abelian. 

7. There exists a group G of order 4 having two generators x, y such that 
$x^2 = y^2 = e$ and xy = yx. Determine all subgroups of G. Show that 

$$G = \{e, x, y, xy\}.$$

8. There exists a group G of order 8 having two generators x, y such that 
$x^4 = y^2 = e$ and $xy = yx^3$. Show that every element of G can be written in 
the form $x^i y^j$ with integers i, j such that i = 0, 1, 2, 3 and j = 0, 1. Conclude 
that these elements are distinct. Make up a multiplication table by writing the 
product of two elements in the blank spaces, and expressing them in the form 
$x^i y^j$ with i = 0, 1, 2, 3 and j = 0, 1. 

  e     | x     | x^2   | x^3   | y     | yx    | yx^2  | yx^3
  ------+-------+-------+-------+-------+-------+-------+------
  x     |       |       |       |       |       |       |
  x^2   |       |       |       |       |       |       |
  x^3   |       |       |       |       | yx^2  |       |
  y     |       |       |       |       |       |       |
  yx    |       |       |       |       |       |       |
  yx^2  |       |       |       |       |       |       |
  yx^3  |       |       |       |       |       |       |

We filled one entry, namely $x^3 yx = yx^2$. 
9. There exists a group G of order 8 having generators denoted by i, j, k such 
that 

$$ij = k, \quad jk = i, \quad ki = j,$$
$$i^2 = j^2 = k^2.$$

Denote $i^2$ by m. 

(a) Show that every element of G can be written in the form 

$$e, \ i, \ j, \ k, \ m, \ mi, \ mj, \ mk,$$

and hence that these are precisely the distinct elements of G. 

(b) Make up a multiplication table as in Exercise 8. 

The group G in Exercise 9 is called the quaternion group. One frequently 
writes $-1, -i, -j, -k$ instead of m, mi, mj, mk. Cf. Chapter VI, §4 for the space of 
quaternions. 

10. There exists a group G of order 12 having generators x, y such that 

$$x^6 = y^2 = e \quad\text{and}\quad xy = yx^5.$$


Show that the elements $x^i y^j$ with $0 \le i \le 5$ and $0 \le j \le 1$ are the distinct 
elements of G. Make up a multiplication table as in the previous exercises. 

11. The groups of Exercises 8 and 10 have representations as groups of symmetries. 
For instance, in Exercise 8, let $\sigma$ be the rotation which maps each 
corner of the square 



on the next corner (taking, say counterclockwise rotation), and let $\tau$ be 
the reflection across the indicated diagonal. Show geometrically that $\sigma$ and $\tau$ 
satisfy the relations of Exercise 8. Express in terms of powers of $\sigma$ and $\tau$ 
the reflection across the horizontal line as indicated on the square. Using the 
notation of §6, we can write $\sigma = [1234]$ and $\tau = [24]$. 

12. In the case of Exercise 10, do the analogous geometric interpretation, taking a 
hexagon instead of a square. 

(Note: The groups of Exercises 11 and 12 can essentially be understood as 
groups of permutations of the vertices. Cf. Exercises 8 and 9 of §6.) 

13. Let G be a group and H a subgroup. Let $x \in G$. Let $xHx^{-1}$ be the subset of 
G consisting of all elements $xyx^{-1}$ with $y \in H$. Show that $xHx^{-1}$ is a 
subgroup of G. 

14. Let G be a group and let S be a set of generators of G. Assume that xy = yx 
for all $x, y \in S$. Prove that G is abelian. Thus to test whether a group is 
abelian or not, it suffices to verify the commutative rule on a set of 
generators. 

Exercises on cyclic groups 

15. A root of unity in the complex numbers is a number $\zeta$ such that $\zeta^n = 1$ for 
some positive integer n. We then say that $\zeta$ is an n-th root of unity. Describe 
the set of n-th roots of unity in $\mathbf{C}$. Show that this set is a cyclic group of 
order n. 

16. Let G be a finite cyclic group of order n. Show that for each positive integer 
d dividing n, there exists a subgroup of order d. 

17. Let G be a finite cyclic group of order n. Let a be a generator. Let r be an 
integer $\ne 0$, and relatively prime to n. 

(a) Show that a r is also a generator of G. 

(b) Show that every generator of G can be written in this form. 

(c) Let p be a prime number, and G a cyclic group of order p. How many 
generators does G have? 

18. Let m, n be relatively prime positive integers. Let G, G' be cyclic groups of 
orders m, n respectively. Show that $G \times G'$ is cyclic, of order mn. 

19. (a) Let G be a multiplicative finite abelian group. Let a be the product of all 
the elements of the group. Prove that $a^2 = e$. 

(b) Suppose in addition that G is cyclic. If G has odd order, show that a = e. 
If G has even order, show that $a \ne e$. 


II, §2. MAPPINGS 

Let S, S' be sets. A mapping (or map) from S to S' is an association which 
to every element of S associates an element of S'. Instead of saying that 
f is a mapping of S into S', we shall often write the symbols $f: S \to S'$. 

If $f: S \to S'$ is a mapping, and x is an element of S, then we denote by 
f(x) the element of S' associated to x by f. We call f(x) the value of f 
at x, or also the image of x under f. The set of all elements f(x), for all 
$x \in S$, is called the image of f. If T is a subset of S, then the set of 
elements f(x) for all $x \in T$ is called the image of T, and denoted by f(T). 

If f is as above, we often write $x \mapsto f(x)$ to denote the image of x 
under f. Note that we distinguish two types of arrows, namely 

$$\to \quad\text{and}\quad \mapsto.$$

Example 1. Let S and S' be both equal to $\mathbf{R}$. Let $f: \mathbf{R} \to \mathbf{R}$ be 
the mapping $f(x) = x^2$, i.e. the mapping whose value at x is $x^2$. We can 
also express this by saying that f is the mapping such that $x \mapsto x^2$. The 
image of f is the set of real numbers $\ge 0$. 
Let $f: S \to S'$ be a mapping, and T a subset of S. Then we can define 
a map $T \to S'$ by the same rule $x \mapsto f(x)$ for $x \in T$. In other words, we 
can view f as defined only on T. This map is called the restriction of f 
to T and is denoted by $f|T: T \to S'$. 

Let S, S' be sets. A map $f: S \to S'$ is said to be injective if whenever 
$x, y \in S$ and $x \ne y$ then $f(x) \ne f(y)$. We could also write this condition 
in the form: if f(x) = f(y) then x = y. 

Example 2. The mapping f of Example 1 is not injective. Indeed, we 
have $f(1) = f(-1)$. Let $g: \mathbf{R} \to \mathbf{R}$ be the mapping $x \mapsto x + 1$. Then g is 
injective, because if $x \ne y$ then $x + 1 \ne y + 1$, i.e. $g(x) \ne g(y)$. 

Let S, S' be sets. A map $f: S \to S'$ is said to be surjective if the image 
f(S) of S is equal to all of S'. This means that given any element $x' \in S'$, 
there exists an element $x \in S$ such that f(x) = x'. One says that f is onto 
S'. 


Example 3. Let $f: \mathbf{R} \to \mathbf{R}$ be the mapping $f(x) = x^2$. Then f is not 
surjective, because no negative number is in the image of f. 

Let $g: \mathbf{R} \to \mathbf{R}$ be the mapping $g(x) = x + 1$. Then g is surjective, 
because given a number y, we have $y = g(y - 1)$. 

Remark. Let $\mathbf{R}'$ denote the set of real numbers $\ge 0$. One can view 
the association $x \mapsto x^2$ as a map of $\mathbf{R}$ into $\mathbf{R}'$. When so viewed, the map 
is then surjective. Thus it is a reasonable convention not to identify this 
map with the map $f: \mathbf{R} \to \mathbf{R}$ defined by the same formula. To be completely 
accurate, we should therefore incorporate the set of arrival and 
the set of departure of the map into our notation, and for instance write 

$$f_S^{S'}: S \to S'$$

instead of our $f: S \to S'$. In practice, this notation is too clumsy, so that 
one omits the indices S, S'. However, the reader should keep in mind 
the distinction between the maps 

$$f_{\mathbf{R}}^{\mathbf{R}'}: \mathbf{R} \to \mathbf{R}' \quad\text{and}\quad f_{\mathbf{R}}^{\mathbf{R}}: \mathbf{R} \to \mathbf{R}$$

both defined by the rule $x \mapsto x^2$. The first map is surjective whereas the 
second one is not. 

Let S, S' be sets, and $f: S \to S'$ a mapping. We say that f is bijective if 
f is both injective and surjective. This means that given an element 
$x' \in S'$, there exists a unique element $x \in S$ such that f(x) = x'. (Existence 
because f is surjective, and uniqueness because f is injective.) 
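
For maps between finite sets, the three notions can be tested by listing values. The sketch below is an illustration only: the finite sets stand in for $\mathbf{R}$, and the function names are assumptions. The domain is passed as a list of distinct elements.

```python
def is_injective(f, domain):
    """Distinct elements of the domain have distinct values."""
    values = [f(x) for x in domain]
    return len(set(values)) == len(values)


def is_surjective(f, domain, codomain):
    """The image f(S) is all of S'."""
    return {f(x) for x in domain} == set(codomain)


def is_bijective(f, domain, codomain):
    return is_injective(f, domain) and is_surjective(f, domain, codomain)


def square(x):
    return x * x


def shift(x):
    return x + 1


S = [-2, -1, 0, 1, 2]

# x -> x^2 from S to {0, 1, 4}: surjective, not injective since (-1)^2 = 1^2.
square_injective = is_injective(square, S)
square_surjective = is_surjective(square, S, [0, 1, 4])

# x -> x + 1 from S to {-1, 0, 1, 2, 3}: bijective.
shift_bijective = is_bijective(shift, S, [-1, 0, 1, 2, 3])
```

Changing the set of arrival changes the answer: viewed as a map into `[0, 1, 2, 3, 4]`, the squaring map above is no longer surjective.
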
Example 4. Let $J_n$ be the set of integers $\{1, 2, \ldots, n\}$. A bijective map 
$\sigma: J_n \to J_n$ is called a permutation of the integers from 1 to n. Thus, in 
particular, a permutation $\sigma$ as above is a mapping $i \mapsto \sigma(i)$. We shall 
study permutations in greater detail later in this chapter. 

Example 5. Let S be a non-empty set, and let 

$$I: S \to S$$

be the map such that I(x) = x for all $x \in S$. Then I is called the identity 
mapping, and also denoted by id. It is obviously bijective. Often we 
need to specify the set S in the notation and we write $I_S$ or $\mathrm{id}_S$ for the 
identity mapping of S. Let T be a subset of S. The identity map $t \mapsto t$ 
for $t \in T$, viewed as a mapping $T \to S$ is called the inclusion, and is 
sometimes denoted by 

$$T \hookrightarrow S.$$

Let S, T, U be sets, and let 

$$f: S \to T \quad\text{and}\quad g: T \to U$$

be mappings. Then we can form the composite mapping 

$$g \circ f: S \to U$$

defined by the rule 

$$(g \circ f)(x) = g(f(x))$$

for all $x \in S$. 

Example 6. Let $f: \mathbf{R} \to \mathbf{R}$ be the map $f(x) = x^2$, and $g: \mathbf{R} \to \mathbf{R}$ the 
map $g(x) = x + 1$. Then $g(f(x)) = x^2 + 1$. Note that in this case, we 
can form also $f(g(x)) = f(x + 1) = (x + 1)^2$, and thus that 

$$f \circ g \ne g \circ f.$$

Composition of mappings is associative. This means: Let S, T, U, V be 
sets, and 

$$f: S \to T, \quad g: T \to U, \quad h: U \to V$$

be mappings. Then 

$$h \circ (g \circ f) = (h \circ g) \circ f.$$
Proof. The proof is very simple. By definition, we have, for any 
element $x \in S$, 

$$(h \circ (g \circ f))(x) = h((g \circ f)(x)) = h(g(f(x))).$$

On the other hand, 

$$((h \circ g) \circ f)(x) = (h \circ g)(f(x)) = h(g(f(x))).$$

By definition, this means that $(h \circ g) \circ f = h \circ (g \circ f)$. 
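
Composition of mappings translates directly into code. The sketch below uses the maps of Example 6 together with a third map h, which is a hypothetical choice added only to test associativity. Checking equality at finitely many points illustrates the statement but does not prove it.

```python
def compose(g, f):
    """Return the composite g o f, i.e. the map x -> g(f(x))."""
    return lambda x: g(f(x))


def f(x):
    return x * x      # f(x) = x^2


def g(x):
    return x + 1      # g(x) = x + 1


def h(x):
    return 2 * x      # assumed third map, not from the text


g_after_f = compose(g, f)   # x -> x^2 + 1
f_after_g = compose(f, g)   # x -> (x + 1)^2

left = compose(h, compose(g, f))    # h o (g o f)
right = compose(compose(h, g), f)   # (h o g) o f
same_on_sample = all(left(x) == right(x) for x in range(-10, 11))
```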

Let S, T, U be sets, and $f: S \to T$, $g: T \to U$ mappings. If f and g are 
injective, then $g \circ f$ is injective. If f and g are surjective, then $g \circ f$ is 
surjective. If f and g are bijective, then so is $g \circ f$. 

Proof. As to the first statement, assume that f, g are injective. Let 
$x, y \in S$ and $x \ne y$. Then $f(x) \ne f(y)$ because f is injective, and hence 
$g(f(x)) \ne g(f(y))$ because g is injective. By the definition of the composite 
map, we conclude that $g \circ f$ is injective. The second statement will 
be left as an exercise. The third is a consequence of the first two and the 
definition of bijective. 

Let $f: S \to S'$ be a mapping. By an inverse mapping for f we mean a 
mapping 

$$g: S' \to S$$

such that 

$$g \circ f = \mathrm{id}_S \quad\text{and}\quad f \circ g = \mathrm{id}_{S'}.$$

As an exercise, prove that if an inverse mapping for f exists, then it is 
unique, in the sense that if $g_1$, $g_2$ are inverse mappings for f, then 
$g_1 = g_2$. We then denote the inverse mapping g by $f^{-1}$. Thus by 
definition, the inverse mapping $f^{-1}$ is characterized by the property that 
for all $x \in S$ and $x' \in S'$ we have 

$$f^{-1}(f(x)) = x \quad\text{and}\quad f(f^{-1}(x')) = x'.$$

Let $f: S \to S'$ be a mapping. Then f is bijective if and only if f has an 
inverse mapping. 

Proof. Suppose f is bijective. We define a mapping $g: S' \to S$ by the 
rule: For $x' \in S'$, let 

g(x') = unique element $x \in S$ such that f(x) = x'. 

It is immediately verified that g satisfies the conditions of being an 
inverse mapping, and therefore the inverse mapping. We leave the other 
implication as Exercise 1, namely prove that if an inverse mapping for f 
exists, then f is bijective. 
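
The construction of g in the proof can be carried out for a map between finite sets stored as a dictionary. This is a sketch under that assumption. The function name `inverse_mapping` and the sample sets are hypothetical.

```python
def inverse_mapping(f, codomain):
    """Build g with g(x') = the unique x such that f(x) = x'.

    f is a dict from S to S'. A ValueError is raised when f is not bijective.
    """
    g = {}
    for x, value in f.items():
        if value in g:
            raise ValueError("f is not injective")
        g[value] = x
    if set(g) != set(codomain):
        raise ValueError("f is not surjective")
    return g


S_prime = ["a", "b", "c"]
f = {1: "a", 2: "b", 3: "c"}      # a bijection from {1, 2, 3} to S'
g = inverse_mapping(f, S_prime)

g_after_f_is_identity = all(g[f[x]] == x for x in f)        # g o f = id_S
f_after_g_is_identity = all(f[g[y]] == y for y in S_prime)  # f o g = id_S'
```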

Example 7. If $f: \mathbf{R} \to \mathbf{R}$ is the map such that 

$$f(x) = x + 1,$$

then $f^{-1}: \mathbf{R} \to \mathbf{R}$ is the map such that $f^{-1}(x) = x - 1$. 

Example 8. Let $\mathbf{R}^+$ denote the set of positive real numbers (i.e. real 
numbers > 0). Let $h: \mathbf{R}^+ \to \mathbf{R}^+$ be the map $h(x) = x^2$. Then h is bijective, 
and its inverse mapping is the square root mapping, i.e. 

$$h^{-1}(x) = \sqrt{x}$$

for all $x \in \mathbf{R}$, x > 0. 

Let S be a set. A bijective mapping $f: S \to S$ of S with itself is called a 
permutation of S. The set of permutations of S is denoted by 

Perm(S). 

Proposition 2.1. The set of permutations Perm(S) is a group, the law of 
composition being composition of mappings. 

Proof. We have already seen that composition of mappings is 
associative. There is a unit element in Perm(S), namely the identity $I_S$. If 
f, g are permutations of S, then we have already remarked that $f \circ g$ and 
$g \circ f$ are bijective, and $f \circ g$, $g \circ f$ map S onto itself, so $f \circ g$ and $g \circ f$ are 
permutations of S. Finally, a permutation f has an inverse $f^{-1}$ as 
already remarked, so all the group axioms are satisfied, and the 
proposition is proved. 
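
For a small finite set, the steps of the proof can be checked by listing all permutations. The sketch below takes S = {1, 2, 3} and stores each permutation as a dictionary. Both choices, and the helper names, are assumptions made for the illustration.

```python
from itertools import permutations

S = (1, 2, 3)

# Each permutation of S as a dict x -> f(x).
perms = [dict(zip(S, image)) for image in permutations(S)]
identity = {x: x for x in S}


def compose(f, g):
    """The composite f o g, i.e. x -> f(g(x))."""
    return {x: f[g[x]] for x in S}


def inverse(f):
    """The inverse mapping of a permutation f."""
    return {value: x for x, value in f.items()}


closed = all(compose(f, g) in perms for f in perms for g in perms)
has_unit = all(compose(identity, f) == f == compose(f, identity) for f in perms)
has_inverses = all(
    compose(f, inverse(f)) == identity == compose(inverse(f), f) for f in perms
)
associative = all(
    compose(compose(f, g), h) == compose(f, compose(g, h))
    for f in perms for g in perms for h in perms
)
```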

If $\sigma$, $\tau$ are permutations of a set S, then we often write 

$$\sigma\tau \quad\text{instead of}\quad \sigma \circ \tau,$$

namely we omit the small circle when we compose permutations to fit the 
abstract formalism of the law of composition in a group. 

Example 9. Let’s go back to plane geometry. A mapping $F: \mathbf{R}^2 \to \mathbf{R}^2$ 
is said to be an isometry if F preserves distances, in other words, given 
two points $P, Q \in \mathbf{R}^2$ we have 


$$\mathrm{dist}(P, Q) = \mathrm{dist}(F(P), F(Q)).$$
In high school geometry, it should have been mentioned that rotations, 
reflections, and translations preserve distances, and so are isometries. It is 
immediate from the definition that a composite of isometries is an 
isometry. It is not immediately clear that an isometry has an inverse. 
However, there is a basic theorem: 

Let F be an isometry. Then there exist reflections $R_1, \ldots, R_m$ (through 
lines $L_1, \ldots, L_m$ respectively) such that 

$$F = R_1 \circ \cdots \circ R_m.$$

The product is of course composition of mappings. If R is a reflection 
through a line L, then $R \circ R = R^2 = I$ so $R = R^{-1}$. Hence if we admit 
the above basic theorem, then we see that every isometry has an inverse, 
namely 

$$F^{-1} = R_m^{-1} \circ \cdots \circ R_1^{-1}.$$

Since the identity I is an isometry, it follows that the set of isometries is 
a subgroup of the group of permutations of $\mathbf{R}^2$. For a proof of the above 
basic theorem, see, for instance, my book with Gene Murrow: Geometry. 

Remark. The notation $f^{-1}$ is also used even when f is not bijective. 
Let X, Y be sets and let 

$$f: X \to Y$$

be a mapping. Let Z be a subset of Y. We define the inverse image 

$f^{-1}(Z)$ = subset of X consisting of those elements $x \in X$ 
such that $f(x) \in Z$. 

Thus in general $f^{-1}$ is NOT a mapping from Y into X, but is a mapping 
from the set of subsets of Y to the set of subsets of X. We call $f^{-1}(Z)$ 
the inverse image of Z under f. You can work out some properties of the 
inverse image in Exercise 6. Often the subset Z may consist of one 
element y. Thus, if $y \in Y$ we define $f^{-1}(y)$ to be the set of all elements 
$x \in X$ such that f(x) = y. If y is not in the image of f, then $f^{-1}(y)$ is 
empty. If y is in the image of f, then $f^{-1}(y)$ may consist of more than 
one element. 

Example 10. Let $f: \mathbf{R} \to \mathbf{R}$ be the mapping $f(x) = x^2$. Then 


$$f^{-1}(1) = \{1, -1\},$$


and $f^{-1}(-2)$ is empty. 
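
The inverse image is easy to compute when the set of departure is finite. In the sketch below a finite window of integers stands in for $\mathbf{R}$. This substitution and the name `inverse_image` are assumptions for the example.

```python
def inverse_image(f, domain, Z):
    """The set of x in the domain such that f(x) lies in Z."""
    return {x for x in domain if f(x) in Z}


def square(x):
    return x * x


X = range(-3, 4)   # assumed finite stand-in for the set of departure

fibre_of_one = inverse_image(square, X, {1})          # two elements
fibre_of_minus_two = inverse_image(square, X, {-2})   # empty
preimage_of_small = inverse_image(square, X, {0, 1, 4})
```

Note that `inverse_image` takes a subset Z and returns a subset of X, in line with the remark that $f^{-1}$ is a mapping between sets of subsets.
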
Example 11. Suppose $f: X \to Y$ is the inclusion mapping, so X is a 
subset of Y. Then $f^{-1}(Z)$ is the intersection, namely 

$$f^{-1}(Z) = Z \cap X.$$

Coordinate maps. Let $Y_i$ ($i = 1, \ldots, n$) be sets. A mapping 

$$f: X \to \prod Y_i = Y_1 \times \cdots \times Y_n$$

of X into the product is given by n mappings $f_i: X \to Y_i$ such that 

$$f(x) = (f_1(x), \ldots, f_n(x)) \quad\text{for all } x \in X.$$

The maps $f_i$ are called the coordinate mappings of f. 

II, §2. EXERCISES 

1. Let $f: S \to S'$ be a mapping, and assume that there exists a map $g: S' \to S$ 
such that 

$$g \circ f = I_S \quad\text{and}\quad f \circ g = I_{S'},$$

in other words, f has an inverse. Show that f is both injective and surjective. 

2. Let $\sigma_1, \ldots, \sigma_r$ be permutations of a set S. Show that 

$$(\sigma_1 \cdots \sigma_r)^{-1} = \sigma_r^{-1} \cdots \sigma_1^{-1}.$$

3. Let S be a non-empty set and G a group. Let M(S, G) be the set of mappings 
of S into G. If $f, g \in M(S, G)$, define $fg: S \to G$ to be the map 
such that (fg)(x) = f(x)g(x). Show that M(S, G) is a group. If G is written 
additively, how would you write the law of composition in M(S, G)? 

4. Give an example of two permutations of the integers {1,2,3} which do not 
commute. 

5. Let S be a set, G a group, and $f: S \to G$ a bijective mapping. For each x, 
$y \in S$ define the product 

$$xy = f^{-1}(f(x)f(y)).$$

Show that this multiplication defines a group structure on S. 

6. Let X, Y be sets and $f: X \to Y$ a mapping. Let Z be a subset of Y. Define 
$f^{-1}(Z)$ to be the set of all $x \in X$ such that $f(x) \in Z$. Prove that if Z, W are 
subsets of Y then 


$$f^{-1}(Z \cap W) = f^{-1}(Z) \cap f^{-1}(W).$$
II, §3. HOMOMORPHISMS 

Let G, G' be groups. A homomorphism 

$$f: G \to G'$$

of G into G' is a map having the following property: For all $x, y \in G$, we 
have 

$$f(xy) = f(x)f(y),$$

and in additive notation, $f(x + y) = f(x) + f(y)$. 

Example 1. Let G be a commutative group. The map $x \mapsto x^{-1}$ of G 
into itself is a homomorphism. In additive notation, this map looks like 
$x \mapsto -x$. The verification that it has the property defining a homomorphism 
is immediate. 

Example 2. The map 


$$z \mapsto |z|$$

is a homomorphism of the multiplicative group of non-zero complex 
numbers into the multiplicative group of non-zero complex numbers (in 
fact, into the multiplicative group of positive real numbers). 

Example 3. The map 

$$x \mapsto e^x$$

is a homomorphism of the additive group of real numbers into the multiplicative 
group of positive real numbers. Its inverse map, the logarithm, 
is also a homomorphism. 

Let G, H be groups and suppose H is a direct product 

$$H = H_1 \times \cdots \times H_n.$$

Let $f: G \to H$ be a map, and let $f_i: G \to H_i$ be its i-th coordinate map. 
Then f is a homomorphism if and only if each $f_i$ is a homomorphism. 

The proof is immediate, and will be left to the reader. 

For the sake of brevity, we sometimes say: “Let $f: G \to G'$ be a 
group-homomorphism” instead of saying: “Let G, G' be groups, and let f 
be a homomorphism of G into G'.” 
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Let f:G-*G' be a group-homomorphism , and let e , e' be the unit ele- 
ments of G, G' respectively. Then f(e) = e' . 

Proof. We have f(e) = f(ee) = f(e)f{e). Multiplying both sides by 
f(e)~ 1 gives the desired result. 

Let $f: G \to G'$ be a group-homomorphism. Let $x \in G$. Then

$$f(x^{-1}) = f(x)^{-1}.$$

Proof. We have

$$e' = f(e) = f(xx^{-1}) = f(x)f(x^{-1}).$$

Let $f: G \to G'$ and $g: G' \to G''$ be group-homomorphisms. Then the com-
posite map $g \circ f$ is a group-homomorphism of G into G''.

Proof. We have

$$(g \circ f)(xy) = g(f(xy)) = g(f(x)f(y)) = g(f(x))g(f(y)).$$

Let $f: G \to G'$ be a group-homomorphism. The image of f is a subgroup
of G'.

Proof. If $x' = f(x)$ with $x \in G$, and $y' = f(y)$ with $y \in G$, then

$$x'y' = f(x)f(y) = f(xy)$$

is also in the image. Also, $e' = f(e)$ is in the image, and $(x')^{-1} = f(x^{-1})$ is
in the image. Hence the image is a subgroup.

Let $f: G \to G'$ be a group-homomorphism. We define the kernel of f
to consist of all elements $x \in G$ such that $f(x) = e'$.

The kernel of a homomorphism $f: G \to G'$ is a subgroup of G.

The proof is routine and will be left to the reader. (The kernel contains
the unit element e because f(e) is the unit element of G'. And so on.)

Example 4. Let G be a group and let $a \in G$. The map

$$n \mapsto a^n$$

is a homomorphism of Z into G. This is merely a restatement of the
rules for exponents discussed in §1. The kernel of this homomorphism
consists of all integers n such that $a^n = e$, and as we have seen in §1, this
kernel is either 0, or is the subgroup generated by the period of a.
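
Example 4 can be checked numerically on a small case. The sketch below is only an illustration under assumptions of our own choosing: G is taken to be the multiplicative group of units modulo 7 and a = 2, neither of which comes from the text. It tests the homomorphism property on a finite window of integers and lists the kernel elements in that window. It relies on Python 3.8 or later, where the built-in pow accepts a negative exponent together with a modulus.

```python
# Assumed illustration: G = units modulo 7 under multiplication, a = 2.
MODULUS = 7
a = 2

def power_map(n):
    """The map n -> a^n from Z into G; negative n uses the modular inverse."""
    return pow(a, n, MODULUS)

# A finite window of integers on which the property is tested.
sample = range(-10, 11)

# Homomorphism property: a^(m+n) = a^m * a^n.
is_hom = all(
    power_map(m + n) == (power_map(m) * power_map(n)) % MODULUS
    for m in sample for n in sample
)

# The period of a: the smallest positive n with a^n = e.
period = next(n for n in range(1, MODULUS) if power_map(n) == 1)

# Kernel elements inside the window: those n with a^n = e.
kernel = [n for n in sample if power_map(n) == 1]
```

In this instance the kernel elements found are exactly the multiples of the period, in agreement with the statement that the kernel is the subgroup of Z generated by the period of a.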

Let $f: G \to G'$ be a group-homomorphism. If the kernel of f consists of
e alone, then f is injective.

Proof. Let $x, y \in G$ and suppose that $f(x) = f(y)$. Then

$$e' = f(x)f(y)^{-1} = f(x)f(y^{-1}) = f(xy^{-1}).$$

Hence $xy^{-1} = e$, and consequently $x = y$, thus showing that f is injective.

An injective homomorphism will be called an embedding. The same 
terminology will be used for other objects which we shall meet later, such 
as rings and fields. An embedding is sometimes denoted by the special
arrow

$$G \hookrightarrow G'.$$

In general, let $f: X \to Y$ be a mapping of sets. Let Z be a subset of Y.
Recall from §2 that we defined the inverse image:

$f^{-1}(Z)$ = subset of elements $x \in X$ such that $f(x) \in Z$.

Let $f: G \to G'$ be a homomorphism and let H' be a subgroup of G'. Let
$H = f^{-1}(H')$ be its inverse image, i.e. the set of $x \in G$ such that
$f(x) \in H'$. Then H is a subgroup of G.

Prove this as Exercise 8.

In the above statement, let us take $H' = \{e'\}$. Thus H' is the trivial
subgroup of G'. Then $f^{-1}(H')$ is the kernel of f by definition.

Let $f: G \to G'$ be a group-homomorphism. We shall say that f is an
isomorphism (or more precisely a group-isomorphism) if there exists a
homomorphism $g: G' \to G$ such that $f \circ g$ and $g \circ f$ are the identity
mappings of G' and G respectively. We denote an isomorphism by the
notation

$$G \approx G'.$$

Remark. Roughly speaking, if a group G has a property which can be 
defined entirely in terms of the group operation, then every group 
isomorphic to G also has the property. Some of these properties are: 
having order n, being abelian, being cyclic, and other properties which 
you will encounter later, such as being solvable, being simple, having a 
trivial center, etc. As you encounter these properties, verify the fact that 
they are invariant under isomorphisms.

Example 5. The function exp is an isomorphism of the additive group
of the real numbers onto the multiplicative group of positive real
numbers. Its inverse is the log.

Example 6. Let G be a commutative group. The map

$$f: x \mapsto x^{-1}$$

is an isomorphism of G onto itself. What is $f \circ f$? What is $f^{-1}$?

A group-homomorphism $f: G \to G'$ which is injective and surjective is an
isomorphism.

Proof. We let $f^{-1}: G' \to G$ be the inverse mapping. All we need to
prove is that $f^{-1}$ is a group-homomorphism. Let $x', y' \in G'$, and let
$x, y \in G$ be such that $f(x) = x'$ and $f(y) = y'$. Then $f(xy) = x'y'$. Hence
by definition,

$$f^{-1}(x'y') = xy = f^{-1}(x')f^{-1}(y').$$

This proves that $f^{-1}$ is a homomorphism, as desired.

From the preceding condition for an isomorphism, we obtain the 
standard test for a homomorphism to be an isomorphism. 

Theorem 3.1. Let $f: G \to G'$ be a homomorphism.

(a) If the kernel of f is trivial, then f is an isomorphism of G with its
image f(G).

(b) If $f: G \to G'$ is surjective and the kernel of f is trivial, then f is an
isomorphism.

Proof. We have proved previously that if the kernel of f is trivial
then f is injective. Since f is always surjective onto its image, the
assertion of the theorem follows from the preceding condition.
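
The test of Theorem 3.1(b) can be tried on a finite analogue of the exponential map. The following sketch assumes an example of our own: the additive group Z/6Z mapped into the units modulo 7 by $x \mapsto 3^x$ (the values p = 7 and g = 3 are illustrative choices, not taken from the text). It checks the homomorphism property, computes the kernel, and compares the image with the full group of units.

```python
# Assumed illustration: p = 7 and g = 3; the map x -> g^x from Z/(p-1)Z
# (additive) into the units modulo p (multiplicative).
p, g = 7, 3
order = p - 1
domain = range(order)

def f(x):
    """The map x -> g^x modulo p."""
    return pow(g, x, p)

# Homomorphism property: f(x + y) = f(x) f(y), with x + y taken modulo p - 1.
is_hom = all(
    f((x + y) % order) == (f(x) * f(y)) % p
    for x in domain for y in domain
)

# Kernel: elements mapped to the unit element 1.
kernel = [x for x in domain if f(x) == 1]

# Image, sorted for comparison with the full set of units 1, ..., p - 1.
image = sorted(f(x) for x in domain)

# Theorem 3.1(b): surjective with trivial kernel means isomorphism.
is_isomorphism = is_hom and kernel == [0] and image == list(range(1, p))
```

Here the kernel is trivial and the map is surjective, so by Theorem 3.1(b) it is an isomorphism in this particular case.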

By an automorphism of a group, one means an isomorphism of the 
group onto itself. The map of Example 6 is an automorphism of the 
commutative group G. What does it look like in additive notation? Ex- 
amples of automorphisms will be given in the exercises. (Cf. Exercises 3, 
4, 5.) Denote the set of automorphisms of G by Aut(G). 

Aut(G) is a subgroup of the group of permutations of G, where the 
law of composition is composition of mappings. 

Verify this assertion in detail. See Exercise 3.

We shall now see that every group is isomorphic to a group of per-
mutations of some set.

Example 7 (Translation). Let G be a group. For each $a \in G$, let

$$T_a: G \to G$$

be the map such that $T_a(x) = ax$. We call $T_a$ the left translation by a.
We contend that $T_a$ is a bijection of G onto itself, i.e. a permutation of
G. If $x \neq y$ then $ax \neq ay$ (multiply on the left by $a^{-1}$ to see this), and
hence $T_a$ is injective. It is surjective, because given $x \in G$, we have

$$x = T_a(a^{-1}x).$$

The inverse mapping of $T_a$ is obviously $T_{a^{-1}}$. Thus the map

$$a \mapsto T_a$$

is a map from G into the group of permutations of the set G. We con-
tend that it is a homomorphism. Indeed, for $a, b \in G$ we have

$$T_{ab}(x) = abx = T_a(T_b(x)),$$

so that $T_{ab} = T_a T_b$. Furthermore, one sees at once that this homomor-
phism is injective. Thus the map

$$a \mapsto T_a \quad (a \in G)$$

is an isomorphism of G onto a subgroup of the group of all permuta-
tions of G. Of course, not every permutation need be given by a transla-
tion, i.e. the image of the map is not necessarily equal to the full group
of permutations of G.
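
The three claims of Example 7 can be verified by brute force on a small group. The sketch below assumes, purely for illustration, that G is the group of permutations of {0, 1, 2}, stored as tuples; the helper names mul, translation and compose are hypothetical and chosen only for this example.

```python
from itertools import permutations

# Assumed illustration: G = permutations of {0, 1, 2}; p[i] is the image of i.
G = list(permutations(range(3)))

def mul(p, q):
    """Product pq as a composite of maps: first apply q, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def translation(a):
    """Left translation T_a, stored as a dict x -> ax."""
    return {x: mul(a, x) for x in G}

def compose(S, T):
    """Composite of two maps of G into itself: (S o T)(x) = S(T(x))."""
    return {x: S[T[x]] for x in G}

# Each T_a is a bijection of G onto itself.
all_bijective = all(sorted(translation(a).values()) == sorted(G) for a in G)

# The map a -> T_a is a homomorphism: T_{ab} = T_a o T_b.
is_hom = all(
    translation(mul(a, b)) == compose(translation(a), translation(b))
    for a in G for b in G
)

# The map a -> T_a is injective: distinct a give distinct translations.
distinct_translations = {tuple(sorted(translation(a).items())) for a in G}
is_injective = len(distinct_translations) == len(G)
```

In this case G has 6 elements while the set G has 720 permutations, so the image of $a \mapsto T_a$ is far from the full permutation group, as the text remarks.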

The terminology of Example 7 is taken from Euclidean geometry. Let
$G = \mathbf{R}^2 = \mathbf{R} \times \mathbf{R}$. We visualize G as the plane. Elements of G are called
2-dimensional vectors. If $A \in \mathbf{R} \times \mathbf{R}$, then the translation

$$T_A: \mathbf{R} \times \mathbf{R} \to \mathbf{R} \times \mathbf{R}$$

such that $T_A(X) = X + A$ for all $X \in \mathbf{R} \times \mathbf{R}$ is visualized as the usual
translation of X by means of the vector A.

Example 8 (Conjugation). Let G be a group and let $a \in G$. Let

$$c_a: G \to G$$

be the map defined by $x \mapsto axa^{-1}$. This map is called conjugation by a.
In Exercises 4 and 5 you will prove:

A conjugation $c_a$ is an automorphism of G, called an inner automorphism.
The association $a \mapsto c_a$ is a homomorphism of G into Aut(G), whose law of
composition is composition of mappings.
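
Before proving these statements in general, it may help to see them hold in one concrete case. The sketch below again assumes G to be the permutations of {0, 1, 2} (our choice, not the text's) and checks by enumeration that each conjugation is a bijective homomorphism and that $c_{ab} = c_a \circ c_b$. It is a numerical check in one example, not a proof.

```python
from itertools import permutations

# Assumed illustration: G = permutations of {0, 1, 2}; p[i] is the image of i.
G = list(permutations(range(3)))

def mul(p, q):
    """Product pq as a composite of maps: first apply q, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inv(p):
    """Inverse permutation of p."""
    r = [0] * len(p)
    for i, image in enumerate(p):
        r[image] = i
    return tuple(r)

def conj(a):
    """Conjugation c_a, stored as a dict x -> a x a^{-1}."""
    return {x: mul(mul(a, x), inv(a)) for x in G}

# c_a is a homomorphism: c_a(xy) = c_a(x) c_a(y).
respects_products = all(
    conj(a)[mul(x, y)] == mul(conj(a)[x], conj(a)[y])
    for a in G for x in G for y in G
)

# c_a is a bijection of G onto itself.
all_bijective = all(sorted(conj(a).values()) == sorted(G) for a in G)

# a -> c_a is a homomorphism into Aut(G): c_{ab} = c_a o c_b.
association_is_hom = all(
    conj(mul(a, b))[x] == conj(a)[conj(b)[x]]
    for a in G for b in G for x in G
)
```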

Let A be an abelian group, written additively. Let B, C be subgroups.
We let B + C consist of all sums b + c, with $b \in B$ and $c \in C$. You can
show as an exercise that B + C is a subgroup, called the sum of B and C.
You can define the sum $B_1 + \cdots + B_r$ of a finite number of subgroups
similarly. We say that A is the direct sum of B and C if every element
$x \in A$ can be written uniquely in the form x = b + c with $b \in B$ and $c \in C$.
We then write

$$A = B \oplus C.$$

Similarly we define A to be the direct sum

$$A = \bigoplus_{i=1}^{r} B_i = B_1 \oplus \cdots \oplus B_r$$

if every element $x \in A$ can be written in the form

$$x = \sum_{i=1}^{r} b_i = b_1 + \cdots + b_r$$

with elements $b_i \in B_i$ uniquely determined by x. In Exercise 14 you will
prove:

Theorem 3.2. The abelian group A is a direct sum of subgroups B and
C if and only if A = B + C and $B \cap C = \{0\}$. This is the case if and
only if the map

$B \times C \to A$ given by $(b, c) \mapsto b + c$

is an isomorphism.
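
The two conditions of Theorem 3.2 are easy to test in a small case. The sketch below assumes the example A = Z/6Z with B = {0, 3} and C = {0, 2, 4}; this choice is ours and serves only as an illustration. It checks that A = B + C, that the intersection of B and C is {0}, and that the map $(b, c) \mapsto b + c$ is a bijection of $B \times C$ onto A (it is additive by construction, since A is abelian).

```python
# Assumed illustration: A = Z/6Z with subgroups B = {0, 3} and C = {0, 2, 4}.
n = 6
A = set(range(n))
B = {0, 3}
C = {0, 2, 4}

# The sum B + C: all b + c with b in B and c in C.
sum_BC = {(b + c) % n for b in B for c in C}

# The intersection of B and C.
intersection = B & C

# Values of the map B x C -> A, (b, c) -> b + c, one per pair (b, c).
values = [(b + c) % n for b in B for c in C]

# The map is a bijection when no value repeats and every element of A occurs.
is_bijection = len(values) == len(set(values)) and set(values) == A
```

Uniqueness of the decomposition x = b + c is exactly the injectivity of this map, which is why the bijection test and the condition $B \cap C = \{0\}$ succeed together.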

Example 9 (The group of homomorphisms). Let A, B be abelian
groups, written additively. Let Hom(A, B) denote the set of homomor-
phisms of A into B. We can make Hom(A, B) into a group as follows.
If f, g are homomorphisms of A into B, we define $f + g: A \to B$ to be the
map such that

$$(f + g)(x) = f(x) + g(x)$$

for all $x \in A$. It is a simple matter to verify that the three group axioms
are satisfied. In fact, if $f, g, h \in \mathrm{Hom}(A, B)$, then for all $x \in A$,

$$((f + g) + h)(x) = (f + g)(x) + h(x) = f(x) + g(x) + h(x),$$

and

$$(f + (g + h))(x) = f(x) + (g + h)(x) = f(x) + g(x) + h(x).$$

Hence $f + (g + h) = (f + g) + h$. We have an additive unit element,
namely the map 0 (called zero) which to each element of A assigns the
zero element of B. It obviously satisfies condition GR 2. Furthermore,
the map $-f$ such that $(-f)(x) = -f(x)$ has the property that

$$f + (-f) = 0.$$

Finally, we must of course observe that $f + g$ and $-f$ are homomorphisms.
Indeed, for $x, y \in A$,

$$\begin{aligned}
(f + g)(x + y) &= f(x + y) + g(x + y) = f(x) + f(y) + g(x) + g(y) \\
&= f(x) + g(x) + f(y) + g(y) \\
&= (f + g)(x) + (f + g)(y),
\end{aligned}$$

so that $f + g$ is a homomorphism. Also,

$$(-f)(x + y) = -f(x + y) = -(f(x) + f(y)) = -f(x) - f(y),$$

and hence $-f$ is a homomorphism. This proves that Hom(A, B) is a
group.


II, §3. EXERCISES 

1. Let R* be the multiplicative group of non-zero real numbers. Describe
explicitly the kernel of the absolute value homomorphism

$$x \mapsto |x|$$

of R* into itself. What is the image of this homomorphism?

2. Let C* be the multiplicative group of non-zero complex numbers. What is
the kernel of the homomorphism absolute value

$$z \mapsto |z|$$

of C* into R*?

3. Let G be a group. Prove that Aut(G) is a subgroup of Perm(G). 

4. Let G be a group. Let a be an element of G. Let

$$c_a: G \to G$$

be the map such that

$$c_a(x) = axa^{-1}.$$

(a) Show that $c_a: G \to G$ is an automorphism of G.

(b) Show that the set of all such maps $c_a$ with $a \in G$ is a subgroup of Aut(G).

5. Let the notation be as in Exercise 4. Show that the association $a \mapsto c_a$ is a
homomorphism of G into Aut(G). The image of this homomorphism is called
the group of inner automorphisms of G. Thus an inner automorphism of G is
one which is equal to $c_a$ for some $a \in G$.

6. If G is not commutative, is $x \mapsto x^{-1}$ a homomorphism? Prove your assertion.

7. (a) Let G be a subgroup of the group of permutations of a set S. If s, t are
elements of S, we define s to be equivalent to t if there exists $\sigma \in G$ such
that $\sigma s = t$. Show that this is an equivalence relation.

(b) Let $s \in S$, and let $G_s$ be the set of all $\sigma \in G$ such that $\sigma s = s$. Show that $G_s$
is a subgroup of G.

(c) If $\tau \in G$ is such that $\tau s = t$, show that $G_t = \tau G_s \tau^{-1}$.

8. Let $f: G \to G'$ be a group homomorphism. Let H' be a subgroup of G'. Show
that $f^{-1}(H')$ is a subgroup of G.

9. Let G be a group and S a set of generators of G. Let $f: G \to G'$ and $g: G \to G'$
be homomorphisms of G into the same group G'. Suppose that f(x) = g(x)
for all $x \in S$. Prove that f = g.

10. Let $f: G \to G'$ be an isomorphism of groups. Let $a \in G$. Show that the period
of a is the same as the period of f(a).

11. Let G be a cyclic group, and $f: G \to G'$ a homomorphism. Show that the
image of G is cyclic.

12. Let G be a commutative group, and n a positive integer. Show that the map
$x \mapsto x^n$ is a homomorphism of G into itself.

13. Let A be an additive abelian group, and let B, C be subgroups. Let B + C
consist of all sums b + c, with $b \in B$ and $c \in C$. Show that B + C is a
subgroup, called the sum of B and C.

14. (a) Give the proof of Theorem 3.2 in all details.

(b) Prove that an abelian group A is a direct sum of subgroups $B_1, \ldots, B_r$ if
and only if the map

$(b_1, \ldots, b_r) \mapsto b_1 + \cdots + b_r$ of $\prod B_i \to A$

is an isomorphism.

15. Let A be an abelian group, written additively, and let n be a positive integer
such that nx = 0 for all $x \in A$. Such an integer n is called an exponent for A.
Assume that we can write n = rs, where r, s are positive relatively prime
integers. Let $A_r$ consist of all $x \in A$ such that rx = 0, and similarly $A_s$ consist
of all $x \in A$ such that sx = 0. Show that every element $a \in A$ can be written
uniquely in the form a = b + c, with $b \in A_r$ and $c \in A_s$. Hence $A = A_r \oplus A_s$.

16. Let A be a finite abelian group of order n, and let $n = p_1^{r_1} \cdots p_s^{r_s}$ be its prime
power factorization, the $p_i$ being distinct. (a) Show that A is a direct sum
$A = A_1 \oplus \cdots \oplus A_s$ where every element $a \in A_i$ satisfies $p_i^{r_i} a = 0$. (b) Prove that
$\#(A_i) = p_i^{r_i}$.

17. Let G be a finite group. Suppose that the only automorphism of G is the
identity. Prove that G is abelian and that every element has order 2. [After
you know the structure theorem for abelian groups, or after you have read
the general definition of a vector space, and viewing G as vector space over
Z/2Z, prove that G has at most 2 elements.]


II, §4. COSETS AND NORMAL SUBGROUPS 

In what follows, we need some convenient notation. Let S, S' be subsets
of a group G. We define the product of these subsets to be

SS' = the set of all elements xx' with $x \in S$ and $x' \in S'$.

It is easy to verify that if $S_1, S_2, S_3$ are three subsets of G, then

$$(S_1 S_2) S_3 = S_1 (S_2 S_3).$$

This product simply consists of all elements xyz, with $x \in S_1$, $y \in S_2$ and
$z \in S_3$. Thus the product of subsets is associative.

Example 1. Show that if H is a subgroup of G, then HH = H. Also if
S is a non-empty subset of H, then SH = H. Check for yourself other
properties, for instance 

Let G be a group, and H a subgroup. Let a be an element of G. The
set of all elements ax with $x \in H$ is called a coset of H in G. We denote it
by aH, following the above notation.

In additive notation, a coset of H would be written a + H.

Since a group G may not be commutative, we shall in fact call aH a 
left coset of H. Similarly, we could define right cosets, but in the sequel, 
unless otherwise specified, coset will mean left coset. 


Theorem 4.1. Let aH and bH be cosets of H in the group G. Either
these cosets are equal, or they have no element in common.

Proof. Suppose that aH and bH have one element in common. We
shall prove that they are equal. Let x, y be elements of H such that
ax = by. Since xH = H = yH (see Example 1) we get

$$aH = axH = byH = bH$$

as was to be proved.

Suppose that G is a finite group. Every element $x \in G$ lies in some
coset of H, namely $x \in xH$. Hence G is the union of all cosets of H. By
Theorem 4.1, we can write G as the union of distinct cosets, so

$$G = \bigcup_{i=1}^{r} a_i H,$$

where the cosets $a_1 H, \ldots, a_r H$ are distinct. When we write G in this
manner, we say that $G = \bigcup a_i H$ is a coset decomposition of G, and that
any element ah with $h \in H$ is a coset representative of the coset aH. In a
coset decomposition of G as above, the coset representatives $a_1, \ldots, a_r$
represent distinct cosets, and are of course all distinct.

If a and b are coset representatives for the same coset, then

$$aH = bH.$$

Indeed, we can write b = ah for some $h \in H$, and then

$$bH = ahH = aH.$$

If G is an infinite group, we can still write G as a union of distinct
cosets, but there may be infinitely many of these cosets. Then we use the
notation

$$G = \bigcup_{i \in I} a_i H$$

where I is some indexing set, not necessarily finite.

Theorem 4.2. Let G be a group, and H a finite subgroup. Then the
number of elements of a coset aH is equal to the number of elements in
H.

Proof. Let x, x' be distinct elements of H. Then ax and ax' are
distinct, for if ax = ax', then multiplying by $a^{-1}$ on the left shows that
x = x'. Hence, if $x_1, \ldots, x_n$ are the distinct elements of H, then $ax_1, \ldots, ax_n$
are the distinct elements of aH, whence our assertion follows.




Let G be a group and let H be a subgroup. The set of left cosets of H 
is denoted by 


G/H. 


We’ll not run across the set of right cosets, but the notation is H\G for 
the set of right cosets, in case you are interested. 

The number of distinct cosets of H in G is called the index of H in G. 
This index may of course be infinite. If G is a finite group, then the 
index of any subgroup is finite. The index of a subgroup H is denoted by 
(G : H). Let #S denote the number of elements of a set S. As a matter of 
notation, we often write #(G/H) = (G:H), and 


$$\#G = (G : 1),$$

in other words, the order of G is the index of the trivial subgroup in G. 


Theorem 4.3. Let G be a finite group and H a subgroup. Then:

(1) order of G = (G : H)(order of H).

(2) The order of a subgroup divides the order of G.

(3) Let $a \in G$. Then the period of a divides the order of G.

(4) If $G \supset H \supset K$ are subgroups, then

$$(G : K) = (G : H)(H : K).$$

Proof. Every element of G lies in some coset (namely, a lies in the
coset aH since a = ae). By Theorem 4.1, every element lies in precisely
one coset, and by Theorem 4.2, any two cosets have the same number of
elements. Formula (1) of our theorem is therefore clear. Then we see
that the order of H divides the order of G. The period of an element a is
the order of the subgroup generated by a, so (3) also follows. As to the
last formula, we have by (1):

$$\#G = (G : H)\#H = (G : H)(H : K)\#K$$

and also

$$\#G = (G : K)\#K.$$

From this (4) follows immediately.
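
Theorems 4.1, 4.2 and 4.3(1) can be watched at work in a small example. The sketch below assumes, for illustration only, that G is the group of permutations of {0, 1, 2} and that H is the subgroup of order 2 generated by the transposition of 0 and 1. It forms all left cosets and checks that they are pairwise equal or disjoint, that each has as many elements as H, and that the order of G is the index times the order of H.

```python
from itertools import permutations

# Assumed illustration: G = permutations of {0, 1, 2}; p[i] is the image of i.
G = list(permutations(range(3)))

def mul(p, q):
    """Product pq as a composite of maps: first apply q, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

# H = {identity, transposition exchanging 0 and 1}, a subgroup of order 2.
H = [(0, 1, 2), (1, 0, 2)]

# All left cosets aH; the set construction removes repeated cosets.
cosets = {frozenset(mul(a, h) for h in H) for a in G}

# Theorem 4.1: two cosets are equal or have no element in common.
equal_or_disjoint = all(c == d or not (c & d) for c in cosets for d in cosets)

# Theorem 4.2: every coset has as many elements as H.
same_size = all(len(c) == len(H) for c in cosets)

# Theorem 4.3(1): order of G = (G : H) * (order of H).
index = len(cosets)
lagrange_holds = len(G) == index * len(H)
```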

Example 2. Let $S_n$ be the group of permutations of $\{1, \ldots, n\}$. Let H
be the subset of $S_n$ consisting of all permutations $\sigma$ such that $\sigma(n) = n$
(i.e. all permutations leaving n fixed). It is clear that H is a subgroup,
and we may view H as the permutation group $S_{n-1}$. (We assume n > 1.)
We wish to describe all the cosets of H. For each integer i with $1 \leq i \leq n$,
let $\tau_i$ be the permutation such that $\tau_i(n) = i$, $\tau_i(i) = n$, and $\tau_i$ leaves all
integers other than n and i fixed. We contend that the cosets

$$\tau_1 H, \ldots, \tau_n H$$

are distinct, and constitute all distinct cosets of H in $S_n$.

To see this, let $\sigma \in S_n$, and suppose $\sigma(n) = i$. Then

$$\tau_i^{-1}\sigma(n) = \tau_i^{-1}(i) = n.$$

Hence $\tau_i^{-1}\sigma$ lies in H, and therefore $\sigma$ lies in $\tau_i H$. We have shown that
every element of $S_n$ lies in some coset $\tau_i H$, and hence $\tau_1 H, \ldots, \tau_n H$ yield
all the cosets. We must still show that these cosets are distinct. If $i \neq j$,
then for any $\sigma \in H$, $\tau_i \sigma(n) = \tau_i(n) = i$ and $\tau_j \sigma(n) = \tau_j(n) = j$. Hence $\tau_i H$
and $\tau_j H$ cannot have any element in common, since elements of $\tau_i H$ and
$\tau_j H$ have distinct effects on n. This proves what we wanted.

From Theorem 4.3, we conclude that

order of $S_n$ = n · order of $S_{n-1}$.

By induction, we see immediately that

order of $S_n$ = $n! = n(n - 1) \cdots 1$.
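
The coset description in Example 2 can be reproduced by enumeration for a small n. The sketch below takes n = 4 as an illustrative assumption and, for programming convenience, permutes the points 0, ..., n - 1 instead of 1, ..., n, so that the point called n in the text is n - 1 in the code. The helper tau is a hypothetical name for the transposition $\tau_i$.

```python
from itertools import permutations
from math import factorial

# Assumed illustration: n = 4, acting on the points 0, ..., n - 1.
n = 4
last = n - 1  # plays the role of the point "n" in the text
S_n = list(permutations(range(n)))

def mul(p, q):
    """Product pq as a composite of maps: first apply q, then p."""
    return tuple(p[q[i]] for i in range(n))

# H: the permutations leaving the last point fixed.
H = [p for p in S_n if p[last] == last]

def tau(i):
    """The permutation exchanging i and the last point, fixing all others."""
    t = list(range(n))
    t[i], t[last] = t[last], t[i]
    return tuple(t)

# The cosets tau_i H for i = 0, ..., n - 1.
cosets = [frozenset(mul(tau(i), h) for h in H) for i in range(n)]

# They are distinct and together cover S_n.
are_distinct = len(set(cosets)) == n
cover_everything = set().union(*cosets) == set(S_n)

# Consequently: order of S_n = n * (order of S_{n-1}).
order_formula_holds = len(S_n) == n * len(H)
```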


Theorem 4.4. Let $f: G \to G'$ be a homomorphism of groups. Let H be
its kernel, and let a' be an element of G' which is in the image of f, say
a' = f(a) for $a \in G$. Then the set of elements x in G such that f(x) = a'
is precisely the coset aH.

Proof. Let $x \in aH$, so that x = ah with some $h \in H$. Then

$$f(x) = f(a)f(h) = f(a).$$

Conversely, suppose that $x \in G$, and f(x) = a'. Then

$$f(a^{-1}x) = f(a)^{-1}f(x) = a'^{-1}a' = e'.$$

Hence $a^{-1}x$ lies in the kernel H, say $a^{-1}x = h$ with some $h \in H$. Then
x = ah, as was to be shown.

Let G be a group and let H be a subgroup. We shall say that H is
normal if H satisfies either one of the following equivalent conditions:

NOR 1. For all $x \in G$ we have xH = Hx, that is $xHx^{-1} = H$.

NOR 2. H is the kernel of some homomorphism of G into some
group.

We shall now prove that these two conditions are equivalent. Suppose
first that H is the kernel of a homomorphism f. Then

$$f(xHx^{-1}) = f(x)f(H)f(x)^{-1} = 1.$$

Hence $xHx^{-1} \subset H$ for all $x \in G$, so $x^{-1}Hx \subset H$, whence $H \subset xHx^{-1}$.
Hence $xHx^{-1} = H$. This proves NOR 2 implies NOR 1. The converse
will be proved in Theorem 4.5 and Corollary 4.6.

Warning. The condition in NOR 1 is not the same as saying that
$xhx^{-1} = h$ for all elements $h \in H$, when G is not commutative. Observe
however that a subgroup of a commutative group is always normal, and
even satisfies the stronger condition than NOR 1, namely $xhx^{-1} = h$ for
all $h \in H$.

We now prove that NOR 1 implies NOR 2 by showing how a 
subgroup satisfying NOR 1 is the kernel of a homomorphism. 

Theorem 4.5. Let G be a group and H a subgroup having the property
that xH = Hx for all $x \in G$. If aH and bH are cosets of H, then the
product (aH)(bH) is also a coset, and the collection of cosets is a group,
the product being defined as above.

Proof. We have (aH)(bH) = aHbH = abHH = abH. Hence the pro-
duct of two cosets is a coset. Condition GR 1 is satisfied in view of
previous remarks on multiplication of subsets of G. Condition GR 2 is
satisfied, the unit element being the coset eH = H itself. (Verify this in
detail.) Condition GR 3 is satisfied, the inverse of aH being $a^{-1}H$.
(Again verify this in detail.) Hence Theorem 4.5 is proved.
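
The role of the hypothesis xH = Hx in Theorem 4.5 shows up clearly in a small experiment. The sketch below assumes G to be the permutations of {0, 1, 2} and compares two subgroups chosen by us for illustration: N, the three cyclic shifts, and H, generated by one transposition. For each it tests the condition xK = Kx and whether the product of any two left cosets is again a left coset. All function names are hypothetical.

```python
from itertools import permutations

# Assumed illustration: G = permutations of {0, 1, 2}; p[i] is the image of i.
G = list(permutations(range(3)))

def mul(p, q):
    """Product pq as a composite of maps: first apply q, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def left_cosets(K):
    """The set of all left cosets aK."""
    return {frozenset(mul(a, k) for k in K) for a in G}

def set_product(S, T):
    """Product of subsets: all st with s in S and t in T."""
    return frozenset(mul(s, t) for s in S for t in T)

def satisfies_nor1(K):
    """Condition xK = Kx for all x in G."""
    return all(
        frozenset(mul(x, k) for k in K) == frozenset(mul(k, x) for k in K)
        for x in G
    )

def coset_products_are_cosets(K):
    """Whether (aK)(bK) is a left coset for all a, b."""
    cosets = left_cosets(K)
    return all(set_product(c, d) in cosets for c in cosets for d in cosets)

N = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]  # the three cyclic shifts
H = [(0, 1, 2), (1, 0, 2)]             # identity and one transposition

results = {
    "N": (satisfies_nor1(N), coset_products_are_cosets(N)),
    "H": (satisfies_nor1(H), coset_products_are_cosets(H)),
}
```

In this example N satisfies the hypothesis and its cosets multiply to cosets, while H fails the hypothesis and some product of two of its cosets has four elements, so it cannot be a coset.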

The group of cosets in Theorem 4.5 is called the factor group of G by
H, or G modulo H. We note that it is a group of left or right
cosets, there being no difference between these by assumption on H.
We emphasize that it is this assumption which allowed us to define multi-
plication of cosets. If the condition xH = Hx for all $x \in G$ is not
satisfied, then we cannot define a group of cosets.

Corollary 4.6. Let G be a group and H a subgroup having the property
that xH = Hx for all $x \in G$. Let G/H be the factor group, and let

$$f: G \to G/H$$

be the map which to each $a \in G$ associates the coset f(a) = aH. Then f
is a homomorphism, and its kernel is precisely H.

Proof. The fact that f is a homomorphism is nothing but a repeti-
tion of the properties of the product of cosets. As for its kernel, it is
clear that every element of H is in the kernel. Conversely, if $x \in G$, and
f(x) = xH is the unit element of G/H, it is the coset H itself, so xH = H.
This means that xe = x is an element of H, so H is equal to the kernel
of f as desired.

We call the homomorphism f in Corollary 4.6 the canonical homo-
morphism of G onto the factor group G/H.

Let $f: G \to G'$ be a homomorphism, and let H be its kernel. Let $x \in G$.
Then for all $h \in H$ we have

$$f(xh) = f(x)f(h) = f(x).$$

We can rewrite this property in the form

$$f(xH) = f(x).$$

Thus all the elements in a coset of H have the same image under f. This 
is an important fact, which we shall use in the next result, which is one of 
the cornerstones for arguments involving homomorphisms. You should 
master this result thoroughly. 
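
The fact that f is constant on each coset of its kernel, together with Theorem 4.4, can be observed directly. The sketch below assumes an example of our own: G is the group of permutations of {0, 1, 2} and f is the sign of a permutation, computed by counting inversions, with values in the multiplicative group {1, -1}. That the sign is a homomorphism is checked numerically here, not proved.

```python
from itertools import permutations

# Assumed illustration: G = permutations of {0, 1, 2}, f = sign.
G = list(permutations(range(3)))

def mul(p, q):
    """Product pq as a composite of maps: first apply q, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def sign(p):
    """Sign of p: +1 for an even number of inversions, -1 for an odd number."""
    inversions = sum(
        1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j]
    )
    return -1 if inversions % 2 else 1

# Numerical check that sign is a homomorphism on G.
is_hom = all(sign(mul(p, q)) == sign(p) * sign(q) for p in G for q in G)

# The kernel H of sign.
H = [p for p in G if sign(p) == 1]

def coset(a):
    """The left coset aH."""
    return frozenset(mul(a, h) for h in H)

# f(xH) = f(x): sign takes a single value on each coset.
constant_on_cosets = all(len({sign(x) for x in coset(a)}) == 1 for a in G)

# Theorem 4.4: the set of x with f(x) = f(a) is precisely aH.
fibers_are_cosets = all(
    frozenset(x for x in G if sign(x) == sign(a)) == coset(a) for a in G
)
```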


Corollary 4.7. Let $f: G \to G'$ be a homomorphism, and let H be its
kernel. Then the association $xH \mapsto f(xH)$ is an isomorphism

$$G/H \xrightarrow{\ \approx\ } \operatorname{Im} f$$

of G/H with the image of f.

Proof. By the remark preceding the corollary, we can define a map

$\bar{f}: G/H \to G'$ by $xH \mapsto f(xH)$.

Remember that G/H is the set of cosets of H. By Theorem 3.1, we have
to verify three things:

$\bar{f}$ is a homomorphism. Indeed,

$$\bar{f}(xHyH) = \bar{f}(xyH) = f(xy) = f(x)f(y) = \bar{f}(xH)\bar{f}(yH).$$

$\bar{f}$ is injective. Indeed, the kernel of $\bar{f}$ consists of those cosets xH
such that $f(xH) = e'$, so consists of H itself, which is the unit element of
G/H.

The image of $\bar{f}$ is the image of f, which follows directly from the way
$\bar{f}$ is defined.

This proves the corollary. 

The isomorphism $\bar{f}$ of the corollary is said to be induced by f. Note
that G/H and the image of f are isomorphic not by any random
isomorphism, but by the map specified in the statement of the corollary,
in other words they are isomorphic by the mapping $\bar{f}$. Whenever one
asserts that two groups are isomorphic, it is best to specify what is the
mapping giving the isomorphism.
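
To make the induced map concrete, the sketch below builds it explicitly in one assumed example: G = G' = Z/12Z written additively, with $f(x) = 4x$. These choices are ours and purely illustrative. The induced map is stored as a dictionary from cosets of the kernel to elements of the image, and the three points of the proof (well defined, homomorphism, bijective onto the image) are checked by enumeration.

```python
# Assumed illustration: G = G' = Z/12Z (additive), f(x) = 4x.
n = 12
G = range(n)

def f(x):
    """The homomorphism x -> 4x of Z/12Z into itself."""
    return (4 * x) % n

kernel = frozenset(x for x in G if f(x) == 0)
image = {f(x) for x in G}

# The cosets x + H of the kernel H: the elements of G/H.
cosets = {frozenset((x + h) % n for h in kernel) for x in G}

# f is constant on each coset, so the induced map is well defined.
well_defined = all(len({f(x) for x in c}) == 1 for c in cosets)

# The induced map: coset -> common value of f on that coset.
induced = {c: f(min(c)) for c in cosets}

def coset_sum(c, d):
    """Sum of two cosets as subsets of G (again a coset, since G is abelian)."""
    return frozenset((x + y) % n for x in c for y in d)

# The induced map is a homomorphism.
is_hom = all(
    induced[coset_sum(c, d)] == (induced[c] + induced[d]) % n
    for c in cosets for d in cosets
)

# The induced map is a bijection of G/H onto the image of f.
is_bijection = sorted(induced.values()) == sorted(image)
```

The dictionary induced is the isomorphism itself in this example, in keeping with the advice to specify the mapping and not merely assert that two groups are isomorphic.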

Example 3. Consider the subgroup Z of the additive group of the real
numbers R. The factor group R/Z is sometimes called the circle group.
Two elements $x, y \in \mathbf{R}$ are called congruent mod Z if $x - y \in \mathbf{Z}$. This
congruence is an equivalence relation, and the congruence classes are
precisely the cosets of Z in R. If $x \equiv y \pmod{\mathbf{Z}}$, then $e^{2\pi i x} = e^{2\pi i y}$, and
conversely. Thus the map

$$x \mapsto e^{2\pi i x}$$

defines an isomorphism of R/Z with the multiplicative group of complex
numbers having absolute value 1. To prove these statements, one must
of course know some facts of analysis concerning the exponential func-
tion.

Example 4. Let C* be the multiplicative group of non-zero complex
numbers, and $\mathbf{R}^+$ the multiplicative group of positive real numbers.
Given a complex number $\alpha \neq 0$, we can write

$$\alpha = ru,$$

where $r \in \mathbf{R}^+$ and u has absolute value 1. (Let $u = \alpha/|\alpha|$.) Such an ex-
pression is uniquely determined, and the map

$$\alpha \mapsto \frac{\alpha}{|\alpha|}$$

is a homomorphism of C* onto the group of complex numbers of abso-
lute value 1. The kernel is $\mathbf{R}^+$, and it follows that $\mathbf{C}^*/\mathbf{R}^+$ is isomorphic
to the group of complex numbers of absolute value 1. (Cf. Exercise 14.)

For coset representatives in Examples 3 and 4, cf. Exercises 15 and 16. 

The exercises list a lot of other basic facts about normal subgroups 
and homomorphisms. The proofs are all easy, and it is better for you to 
carry them out than to clutter up the text with them. You will learn 
them better for that. You should know these basic results as a matter of 
course. Especially, you will find a very useful test for a subgroup to be 
normal, namely: 

Example 5. Let H be a subgroup of a finite group G, and suppose that 
the index (G : H ) is equal to the smallest prime number dividing the order 
of G. Then H is normal. In particular, a subgroup of index 2 is normal. 
See Exercises 29 and 30. 

We shall now give a description of some standard cases of homomor- 
phisms and isomorphisms. The easiest case is the following. 

Let $K \subset H \subset G$ be normal subgroups of a group G. Then the associa-
tion

$xK \mapsto xH$ for $x \in G$

is a surjective homomorphism

$$G/K \to G/H,$$

which we also call the canonical homomorphism. The kernel is H/K.

The verification is immediate and will be left to you.

Note that following Corollary 4.7, we have the formula

$$G/H \approx (G/K)/(H/K),$$

which is analogous to an elementary rule of arithmetic.
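
The formula can be tested in the simplest setting, with integers. The sketch below assumes G = Z, K = 12Z and H = 4Z (illustrative values of our own, with 4 dividing 12), so that G/K is Z/12Z and H/K is the subgroup of multiples of 4 in it. It forms the cosets of H/K in G/K and checks that sending a coset to the common residue modulo 4 of its members is a bijective homomorphism onto Z/4Z, which plays the role of G/H.

```python
# Assumed illustration: G = Z, K = nZ, H = mZ with m dividing n.
n, m = 12, 4

G_mod_K = range(n)  # Z/nZ, with classes represented by 0, ..., n - 1
H_mod_K = frozenset(x for x in G_mod_K if x % m == 0)  # mZ/nZ inside Z/nZ

# The elements of (G/K)/(H/K): cosets of H/K in G/K.
quotient = {frozenset((x + h) % n for h in H_mod_K) for x in G_mod_K}

# All members of a coset have the same residue modulo m, because m divides n.
well_defined = all(len({x % m for x in c}) == 1 for c in quotient)

def residue(c):
    """The common residue modulo m of the members of the coset c."""
    return min(c) % m

def coset_sum(c, d):
    """Sum of two cosets as subsets of Z/nZ."""
    return frozenset((x + y) % n for x in c for y in d)

# The map (G/K)/(H/K) -> Z/mZ is a homomorphism and a bijection.
is_hom = all(
    residue(coset_sum(c, d)) == (residue(c) + residue(d)) % m
    for c in quotient for d in quotient
)
is_bijection = sorted(residue(c) for c in quotient) == list(range(m))
```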

Example. Let G = Z be the additive group of integers. The subgroups
of Z are simply the sets of type nZ. Let m, n be positive integers. We
have $n\mathbf{Z} \subset m\mathbf{Z}$ if and only if m divides n. (Proof?) Thus if $m \mid n$, we have
a canonical homomorphism

$$\mathbf{Z}/n\mathbf{Z} \to \mathbf{Z}/m\mathbf{Z}.$$

If we write n = md, then we can also write the canonical homomorphism
as

$$\mathbf{Z}/md\mathbf{Z} \to \mathbf{Z}/m\mathbf{Z}.$$

The following facts are used constantly, and you will prove them as
Exercises 7 and 9, using Corollary 4.7.

Theorem 4.8. Let $f: G \to G'$ be a surjective homomorphism. Let H' be a
normal subgroup of G' and let $H = f^{-1}(H')$. Then H is normal, and the
map $x \mapsto f(x)H'$ is a homomorphism of G onto G'/H' whose kernel is H.
Therefore we obtain an isomorphism

$$G/H \approx G'/H'.$$

Note that the homomorphism $x \mapsto f(x)H'$ of G into G'/H' can also
be described as the composite homomorphism

$$G \xrightarrow{f} G' \to G'/H'$$

where $G' \to G'/H'$ is the canonical homomorphism. Using Theorem 4.8,
you can prove:

Theorem 4.9. Let G be a group and H a subgroup. Let N be a
normal subgroup of G. Then:

(1) HN is a subgroup of G;

(2) $H \cap N$ is a normal subgroup of H;

(3) The association

$$f: h \mapsto hN$$

is a homomorphism of H into G/N, whose kernel is $H \cap N$. The
image of f is the subgroup HN/N of G/N, so we obtain an
isomorphism

$$\bar{f}: H/(H \cap N) \xrightarrow{\ \approx\ } HN/N.$$

Remark. Once you show that $H \cap N$ is the kernel of the association
in (3), then you have also proved (2).

Theorem 4.8 is important in the following context. We define a group
G to be solvable if there exists a sequence of subgroups

$$G = H_0 \supset H_1 \supset \cdots \supset H_r = \{e\}$$

such that $H_{i+1}$ is normal in $H_i$ for $i = 0, \ldots, r - 1$, and such that $H_i/H_{i+1}$
is abelian. It is a problem to determine which groups are solvable, and
which are not. A famous theorem of Feit-Thompson states that every
group of odd order is solvable. In §6 you will see a proof that the
permutation group on n elements is not solvable for $n \geq 5$. The groups
of Theorem 3.8 in Chapter VI give other examples. As an application of
Theorem 4.8, we shall now prove:

Theorem 4.10. Let G be a group and K a normal subgroup. Assume
that K and G/K are solvable. Then G is solvable.

Proof. By definition, and the assumption that K is solvable, it suffices
to prove the existence of a sequence of subgroups

$$G = H_0 \supset H_1 \supset \cdots \supset H_m = K$$

such that $H_{i+1}$ is normal in $H_i$ and $H_i/H_{i+1}$ is abelian, for all i. Let
$\bar{G} = G/K$. By assumption, there exists a sequence of subgroups

$$\bar{G} = \bar{H}_0 \supset \bar{H}_1 \supset \cdots \supset \bar{H}_m = \{\bar{e}\}$$

such that $\bar{H}_{i+1}$ is normal in $\bar{H}_i$ and $\bar{H}_i/\bar{H}_{i+1}$ is abelian for all i. Let
$f: G \to G/K$ be the canonical homomorphism and let $H_i = f^{-1}(\bar{H}_i)$. By
Theorem 4.8 we have an isomorphism $H_i/H_{i+1} \approx \bar{H}_i/\bar{H}_{i+1}$ and
$K = f^{-1}(\bar{H}_m)$, so we have found the sequence of subgroups of G as we
wanted, proving the theorem.

In mathematics, you will meet many structures, of which groups are 
only one basic type. Whatever structure one meets, one asks system- 
atically for a classification of these structures, and especially one tries to 
answer the following questions: 

1. What are the simplest structures in the category under considera- 
tion? 

2. To what extent can any structure be expressed as a product of 
simple ones? 

3. What is the structure of the group of automorphisms of a given 
object? 

In a sense, the objects having the “simplest” structure are the building 
blocks for the more complicated objects. Taking direct products is an 
easy way to form more complicated objects, but there are more complex 
ways. For instance, we define a group to be simple if it has no normal 
subgroup except the group itself and the trivial subgroup consisting of 
the unit element. Let G be a finite group. Then one can find a sequence
of subgroups

$$G = H_0 \supset H_1 \supset H_2 \supset \cdots \supset H_r = \{e\}$$

such that $H_{k+1}$ is normal in $H_k$ and such that $H_k/H_{k+1}$ is simple.
The Jordan-Hölder theorem states that at least the sequence of simple
factor groups $H_k/H_{k+1}$ is uniquely determined up to a permutation and
up to isomorphism. You can look up a proof in a more advanced
algebra text (e.g., my Algebra). Such a sequence already gives information 
about G. To get a full knowledge of G, one would have to know how 
these factor groups are pieced together. Usually it is not true that G is 
the direct product of all these factor groups. There is also the question of 
classifying all simple groups. In §7 we shall see how a finite abelian 
group is a direct product of cyclic factors. In Chapter VI, Theorem 3.8, 
you will find an example of a simple group. Of course, every cyclic group 
of prime order is simple. As an exercise, prove that a simple abelian 
group is cyclic of prime order or consists only of the unit element. 


II, §4. EXERCISES 

1. Let G be a group and H a subgroup. If $x, y \in G$, define x to be equivalent to
y if x is in the coset yH. Prove that this is an equivalence relation.

2. Let $f: G \to G'$ be a homomorphism with kernel H. Assume that G is finite.

(a) Show that

order of G = (order of image of f)(order of H).

(b) Suppose that G, G' are finite groups and that the orders of G and G' are
relatively prime. Prove that f is trivial, that is, $f(G) = e'$, the unit element
of G'.

3. The index formula of Theorem 4.3(4) holds even if G is not finite. All that 
one needs to assume is that H , K are of finite index. Namely, using only that 
assumption, prove that 


(G : K) =(G : H)(H : K). 


In fact, suppose that 


m r 

G = (J « f /7 and H = [j bjK 

i = 1 J= 1 

are coset decompositions of G with respect to H, and H with respect to K. 
Prove that 

G = (J tibjK 

ij 

is a coset decomposition of G with respect to K. Thus you have to prove 
that G is the union of the cosets * x b jK (i = l,...,m;y = l,...,r) and that these 
cosets are all distinct. 
4. Let p be a prime number and let G be a group with a subgroup H of index p 
in G. Let S be a subgroup of G such that $G \supset S \supset H$. Prove that S = H or 
S = G. (This theorem applies in particular if H is of index 2.) 

5. Show that the group of inner automorphisms is normal in the group of all 
automorphisms of a group G. 

6. Let $H_1, H_2$ be two normal subgroups of G. Show that $H_1 \cap H_2$ is normal. 

7. Let $f: G \to G'$ be a homomorphism and let H' be a normal subgroup of G'. 
Let $H = f^{-1}(H')$. 

(a) Prove that H is normal in G. 

(b) Prove Theorem 4.8. 

8. Let $f: G \to G'$ be a surjective homomorphism. Let H be a normal subgroup 
of G. Show that $f(H)$ is a normal subgroup of G'. 

9. Prove Theorem 4.9. 

10. Let G be a group. Define the center of G to be the subset of all elements a in 
G such that $ax = xa$ for all $x \in G$. Show that the center is a subgroup, and 
that it is a normal subgroup. Show that it is the kernel of the conjugation 
homomorphism $x \mapsto \gamma_x$ in Exercise 5, §3. 

11. Let G be a group and H a subgroup. Let $N_H$ be the set of all $x \in G$ such that 
$xHx^{-1} = H$. Show that $N_H$ is a group containing H, and H is normal in $N_H$. 
The group $N_H$ is called the normalizer of H. 

12. (a) Let G be the set of all maps of $\mathbf{R}$ into itself of type $x \mapsto ax + b$, where 
$a \in \mathbf{R}$, $a \ne 0$ and $b \in \mathbf{R}$. Show that G is a group under composition. We denote 
such a map by $\sigma_{a,b}$. Thus $\sigma_{a,b}(x) = ax + b$. 

(b) To each map $\sigma_{a,b}$ we associate the number a. Show that the association 
$\sigma_{a,b} \mapsto a$ is a homomorphism of G into $\mathbf{R}^*$. Describe the kernel. 

13. View Z as a subgroup of the additive group of rational numbers Q. Show 
that given an element $x \in \mathbf{Q}/\mathbf{Z}$ there exists an integer $n \ge 1$ such that nx = 0. 

14. Let D be the subgroup of $\mathbf{R}$ generated by $2\pi$. Let $\mathbf{R}^+$ be the multiplicative 
group of positive real numbers, and $\mathbf{C}^*$ the multiplicative group of non-zero 
complex numbers. Show that $\mathbf{C}^*$ is isomorphic to $\mathbf{R}^+ \times \mathbf{R}/D$ under the 
map 

\[ (r, \theta) \mapsto re^{i\theta}. \]

(Of course, you must use properties of the complex exponential map.) 

15. Show that every coset of $\mathbf{Z}$ in $\mathbf{R}$ has a unique coset representative x such that 
$0 \le x < 1$. [Hint: For each real number y, let n be the integer such that 
$n \le y < n + 1$.] 

16. Show that every coset of $\mathbf{R}^+$ in $\mathbf{C}^*$ has a unique representative complex 
number of absolute value 1. 
17. Let G be a group and let $x_1, \ldots, x_r$ be a set of generators. Let H be a 
subgroup. 

(a) Assume that $x_i H x_i^{-1} = H$ for $i = 1, \ldots, r$. Show that H is normal in G. 

(b) Suppose G is finite. Assume that $x_i H x_i^{-1} \subset H$ for $i = 1, \ldots, r$. Show that 
H is normal in G. 

(c) Suppose that H is generated by elements $y_1, \ldots, y_m$. Assume that 
$x_i y_j x_i^{-1} \in H$ for all i, j. Assume again that G is finite. Show that H is 
normal. 

18. Let G be the group of Exercise 8 of §1. Let H be the subgroup generated by 
x, so $H = \{e, x, x^2, x^3\}$. Prove that H is normal. 

19. Let G be the group of Exercise 9, §1, that is, G is the quaternion group. Let 
H be the subgroup generated by i, so $H = \{e, i, i^2, i^3\}$. Prove that H is 
normal. 

20. Let G be the group of Exercise 10, §1. Let H be the subgroup generated by 
x, so $H = \{e, x, \ldots, x^5\}$. Prove that H is normal. 


Commutators and solvable groups 


21. (a) Let G be a commutative group, and H a subgroup. Show that G/H is 
commutative. 

(b) Let G be a group and H a normal subgroup. Show that G/H is 
commutative if and only if H contains all elements $xyx^{-1}y^{-1}$ for $x, y \in G$. 


Define the commutator subgroup $G^c$ to be the subgroup generated by all 
elements 

\[ xyx^{-1}y^{-1} \quad\text{with } x, y \in G. \]

Such elements are called commutators. 

22. (a) Show that the commutator subgroup is a normal subgroup. 

(b) Show that $G/G^c$ is abelian. 

23. Let G be a group, H a subgroup, and N a normal subgroup. Prove that if 
G/N is abelian, then H/(H n N) is abelian. 

24. (a) Let G be a solvable group, and H a subgroup. Prove that H is solvable. 

(b) Let G be a solvable group, and $f: G \to G'$ a surjective homomorphism. 
Show that G' is solvable. 


25. (a) Prove that a simple finite abelian group is cyclic of prime order. 

(b) Let G be a finite abelian group. Prove that there exists a sequence of 
subgroups 

\[ G \supset H_1 \supset \cdots \supset H_r = \{e\} \]

such that $H_i/H_{i+1}$ is cyclic of prime order for all i. 

Conjugation 

26. Let G be a group. Let S be the set of subgroups of G. If H, K are subgroups 
of G, define H to be conjugate to K if there exists an element $x \in G$ such that 
$xHx^{-1} = K$. Prove that conjugacy is an equivalence relation in S. 
27. Let G be a group and S the set of subgroups of G. For each $x \in G$, let 
$c_x: S \to S$ be the map such that 

\[ c_x(H) = xHx^{-1}. \]

Show that $c_x$ is a permutation of S, and that the map $x \mapsto c_x$ is a homomorphism 

\[ c: G \to \mathrm{Perm}(S). \]


Translations 

28. Let G be a group and H a subgroup of G. Let S be the set of cosets of H in 
G. For each $x \in G$, let $T_x: S \to S$ be the map which to each coset yH 
associates the coset xyH. Prove that $T_x$ is a permutation of S, and that the 
map $x \mapsto T_x$ is a homomorphism 

\[ T: G \to \mathrm{Perm}(S). \]

29. Let H be a subgroup of G, of finite index n. Let S be the set of cosets of H. 
Let 

\[ T: x \mapsto T_x \]

be the homomorphism of $G \to \mathrm{Perm}(S)$ described in the preceding exercise. 
Let K be the kernel of this homomorphism. Prove: 

(a) K is contained in H. 

(b) #(G/K) divides n!. 

(c) If H is of index 2, then H is normal, and in fact, H = K. 

[Hint: use the index formula and prove that (H : K) = 1. If you get into 
trouble, look at Proposition 8.3.] 

30. Let G be a finite group and let p be the smallest prime number dividing the 
order of G. Let H be a subgroup of index p. Prove that H is normal in G. 
[Hint: Let S be the set of cosets of H. Consider the homomorphism 

\[ x \mapsto T_x \quad\text{of } G \to \mathrm{Perm}(S). \]

Let K be the kernel of this homomorphism. Use the preceding exercise and 
the index formula as well as Corollary 4.7.] 

31. Let G be a group and let $H_1, H_2$ be subgroups of finite index. Prove that 
$H_1 \cap H_2$ has finite index. In fact, prove something stronger. Prove that there 
exists a normal subgroup N of finite index such that $N \subset H_1 \cap H_2$. [Hint: 
Let $S_1$ be the set of cosets of $H_1$ and let $S_2$ be the set of cosets of $H_2$. Let 
$S = S_1 \times S_2$. Define a map 

\[ T: G \to \mathrm{Perm}(S) = \mathrm{Perm}(S_1 \times S_2) \]

by the formula 

\[ T_x(aH_1, bH_2) = (xaH_1, xbH_2), \quad\text{for } x, a, b \in G. \]

Show that this map $x \mapsto T_x$ is a homomorphism. What is the order of S? 
Show that the kernel of this homomorphism is contained in $H_1 \cap H_2$. Use 
Corollary 4.7.] 


II, §5. APPLICATION TO CYCLIC GROUPS 

Let G be a cyclic group and let a be a generator. Let $d\mathbf{Z}$ be the kernel of 
the homomorphism 

\[ \mathbf{Z} \to G \quad\text{such that}\quad n \mapsto a^n. \]

In case the kernel of this homomorphism is 0, we get an isomorphism of 
$\mathbf{Z}$ with G. In the second case, as an application of Corollary 4.7, when 
the kernel is not 0, we get an isomorphism 

\[ \mathbf{Z}/d\mathbf{Z} \xrightarrow{\;\approx\;} G. \]
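
As a concrete instance of this isomorphism, the following Python sketch takes G to be the multiplicative group of non-zero residues mod 7 with generator a = 3. This choice of group and generator is ours, made only for illustration. The code computes the period d of a and checks that $n \mapsto a^n$ is a bijective homomorphism from $\mathbf{Z}/d\mathbf{Z}$ onto G.

```python
def period(a, modulus):
    """Smallest d >= 1 with a**d == 1 (mod modulus); a must be a unit."""
    d, power = 1, a % modulus
    while power != 1:
        power = (power * a) % modulus
        d += 1
    return d

a, q = 3, 7                      # illustrative choice: G = (Z/7Z)*, generator 3
d = period(a, q)                 # the kernel of n -> a^n is dZ

# f(n) = a^n on the representatives 0, ..., d-1 of Z/dZ
f = {n: pow(a, n, q) for n in range(d)}

# f is bijective onto G = {1, ..., 6}
assert sorted(f.values()) == [1, 2, 3, 4, 5, 6]

# f turns addition mod d into multiplication mod q
assert all(f[(m + n) % d] == (f[m] * f[n]) % q
           for m in range(d) for n in range(d))
print(d)
```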


Theorem 5.1. Any two cyclic groups of order d are isomorphic. If a is 
a generator of G of period d, then there is a unique isomorphism 

\[ f: \mathbf{Z}/d\mathbf{Z} \to G \]

such that $f(1) = a$. If $G_1, G_2$ are cyclic of order d, and $a_1, a_2$ are 
generators of $G_1, G_2$ respectively, then there is a unique isomorphism 

\[ g: G_1 \to G_2 \]

such that $g(a_1) = a_2$. 

Proof. By the remarks before the theorem, let 

\[ f_1: \mathbf{Z}/d\mathbf{Z} \to G_1 \quad\text{and}\quad f_2: \mathbf{Z}/d\mathbf{Z} \to G_2 \]

be isomorphisms such that $f_1(n) = a_1^n$ and $f_2(n) = a_2^n$ for all $n \in \mathbf{Z}$. Then 

\[ h = f_2 \circ f_1^{-1}: G_1 \to G_2 \]

is an isomorphism such that $h(a_1) = a_2$. Conversely, let $g: G_1 \to G_2$ be an 
isomorphism such that $g(a_1) = a_2$. Then 

\[ g(a_1^n) = g(a_1)^n = a_2^n \]

and therefore g = h, thereby proving the uniqueness. 

Remark. The uniqueness part of the preceding result is a special case 
of a very general principle: 


Theorem 5.2. A homomorphism is uniquely determined by its values on a 
set of generators. 

We expand on what this means. Let G, G' be groups. Suppose that G 
is generated by a subset of elements S. In other words, every element of 
G can be written as a product 

\[ x = x_1 \cdots x_r \quad\text{with } x_i \in S \text{ or } x_i^{-1} \in S. \]

Let $b_1, \ldots, b_r$ be elements of G'. If there is a homomorphism 

\[ f: G \to G' \]

such that $f(x_i) = b_i$ for $i = 1, \ldots, r$, then such a homomorphism is uniquely 
determined. In other words, if 

\[ g: G \to G' \]

is a homomorphism such that $g(x_i) = b_i$ for $i = 1, \ldots, r$, then g = f. The 
proof is immediate, because for any element x written as above, 
$x = x_1 \cdots x_r$ with $x_i \in S$ or $x_i^{-1} \in S$, we have 

\[ g(x) = g(x_1) \cdots g(x_r) = f(x_1) \cdots f(x_r) = f(x). \]

Of course, given arbitrary elements $b_1, \ldots, b_r \in G'$ there does not necessarily 
exist a homomorphism $f: G \to G'$ such that $f(x_i) = b_i$. Sometimes 
such an f exists and sometimes it does not exist. Look at Exercise 
12 to see an example. 
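
The last remark can be checked by brute force in a small case. The sketch below is our own illustration, with the groups $\mathbf{Z}/4\mathbf{Z}$ and $\mathbf{Z}/6\mathbf{Z}$ (written additively) chosen as an assumption rather than taken from the text. Since 1 generates $\mathbf{Z}/4\mathbf{Z}$, a homomorphism f is determined by b = f(1), and the only candidate is f(k) = kb. The code tests which values of b actually give a homomorphism.

```python
def extends_to_homomorphism(n, m, b):
    """Check whether 1 -> b extends to a homomorphism Z/nZ -> Z/mZ.

    The only candidate is f(k) = k*b mod m, because a homomorphism is
    determined by its value on the generator 1. We test whether this
    candidate respects addition modulo n.
    """
    f = [(k * b) % m for k in range(n)]
    return all(f[(x + y) % n] == (f[x] + f[y]) % m
               for x in range(n) for y in range(n))

# Values b in Z/6Z for which some homomorphism Z/4Z -> Z/6Z sends 1 to b
admissible = [b for b in range(6) if extends_to_homomorphism(4, 6, b)]
print(admissible)
```

Only those b with 4b = 0 in $\mathbf{Z}/6\mathbf{Z}$ survive, which matches the statement that prescribed values on generators need not extend to a homomorphism.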

Theorem 5.3. Let G be a cyclic group. Then a factor group of G is 
cyclic and a subgroup of G is cyclic. 

Proof. We leave the case of a factor group as an exercise. See 
Exercise 11 of §3. Let us prove that a subgroup of G is cyclic. Let a be a 
generator of G, so that we have a surjective homomorphism $f: \mathbf{Z} \to G$ 
such that $f(n) = a^n$. Let H be a subgroup of G. Then $f^{-1}(H)$ (the set of 
$n \in \mathbf{Z}$ such that $f(n) \in H$) is a subgroup A of $\mathbf{Z}$, and hence is cyclic. In 
fact, we know that there exists a unique integer $d \ge 0$ such that $f^{-1}(H)$ 
consists of all integers which can be written in the form md with $m \in \mathbf{Z}$. 
Since f is surjective, it follows that f maps A on all of H, i.e. every 
element of H is of the form $a^{md}$ with some integer m. It follows that H is 
cyclic, and in fact $a^d$ is a generator. In fact we have proved: 

Theorem 5.4. Let G be cyclic of order n, and let n = md be a factorization 
in positive integers m, d. Then G has a unique subgroup of order 
m. If $G = \langle a \rangle$, then this subgroup is generated by $a^d = a^{n/m}$. 
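
Theorems 5.3 and 5.4 can be verified by direct enumeration in a small case. The following sketch is an illustration under our own assumptions: it works in $\mathbf{Z}/12\mathbf{Z}$ written additively, so that the element $a^d$ of the theorem corresponds to the residue d. It lists every cyclic subgroup and checks that for each divisor m of n there is exactly one subgroup of order m, generated by d = n/m.

```python
def subgroup_generated(n, g):
    """Cyclic subgroup of Z/nZ (written additively) generated by g."""
    return frozenset((k * g) % n for k in range(n))

def subgroups_of_cyclic(n):
    """All subgroups of Z/nZ; by Theorem 5.3 each of them is cyclic."""
    return {subgroup_generated(n, g) for g in range(n)}

n = 12
by_order = {}
for H in subgroups_of_cyclic(n):
    by_order.setdefault(len(H), []).append(H)

for m in sorted(by_order):
    d = n // m
    # exactly one subgroup of order m, and it is generated by d = n/m
    assert by_order[m] == [subgroup_generated(n, d)]

print(sorted(by_order))   # the orders that occur are the divisors of n
```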


II, §5. EXERCISES 

1. Show that a group of order 4 is isomorphic to one of the following groups: 

(a) The group with two distinct elements a, b such that 

\[ a^2 = b^2 = e \quad\text{and}\quad ab = ba. \]

(b) The group G having an element a such that $G = \{e, a, a^2, a^3\}$ and $a^4 = e$. 

2. Prove that every group of prime order is cyclic. 

3. Let G be a cyclic group of order n. For each $a \in \mathbf{Z}$ define $f_a: G \to G$ by 
$f_a(x) = x^a$. 

(a) Prove that f a is a homomorphism of G into itself. 

(b) Prove that f a is an automorphism of G if and only if a is prime to n. 

4. Again assume that G is a group of order p, prime. What is the order of 
Aut(G)? Proof? 

5. Let G and Z be cyclic groups of order n. Show that Hom(G, Z) is cyclic of 
order n. [Hint: If a is a generator of G, show that for each $z \in Z$ there exists 
a unique homomorphism $f: G \to Z$ such that $f(a) = z$.] 

6. Let G be the group of Exercise 8, §1 and let H be the subgroup generated by 
x. Either recall or prove that H is normal. Show that G/H is cyclic of order 2. 

7. Let G be the group of Exercise 10, §1, and let H be the subgroup generated 
by x. Show that G/H is cyclic of order 2. 

8. Let Z be the center of a group G, and suppose that G/Z is cyclic. Prove that 
G is abelian. 

9. Let G be a finite group which contains a subgroup H, and assume that H is 
contained in the center of G. Assume that (G : H) =p for some prime number 
p. Prove that G is abelian. 
10. Let G be the group of order 8 of Exercise 8 of §1, so G is generated by 
elements x, y satisfying $x^4 = e = y^2$ and $yxy = x^3$. 

(a) Prove that all subgroups of G are those shown on the following diagram. 



(b) Determine all the normal subgroups of G. 

11. Let G be the quaternion group of Exercise 9 of §1. 

(a) Make up a lattice of all subgroups of G similar to the lattice of subgroups 
of the preceding exercise. 

(b) Prove that all subgroups of G are normal. 

12. (a) Let H, N be normal subgroups of a finite group G. Assume that the 
orders of H, N are relatively prime. Prove that xy = yx for all $x \in H$ and 
$y \in N$, and that $HN \approx H \times N$. [Hint: Show that $xyx^{-1}y^{-1} \in H \cap N$.] 

(b) Let G be a finite group and let $H_1, \ldots, H_r$ be normal subgroups such that 
the order of $H_i$ is relatively prime to the order of $H_j$ for $i \ne j$. Prove that 

\[ H_1 \cdots H_r \approx H_1 \times \cdots \times H_r. \]

13. Let G be a finite group. Let N be a normal subgroup such that N and G/N 
have orders relatively prime. Let H be a subgroup of G having the same 
order as G/N. Prove that G = HN. 

14. Let G be a finite group. Let N be a normal subgroup such that N and G/N 
have orders relatively prime. Let $\varphi$ be an automorphism of G. Prove that 
$\varphi(N) = N$. 

15. Let G be a group and H an abelian normal subgroup. For $x \in G$ let $c_x$ denote 
conjugation. 

(a) Show that the association $x \mapsto c_x$ induces a homomorphism 

\[ G/H \to \mathrm{Aut}(H). \]

(b) Suppose that G is finite, that #(G/H) is relatively prime to #Aut(H), and 
that G/H is cyclic. Prove that G is abelian. 

16. Let G be a group of order $p^2$. Prove that G is abelian. [Hint: Let $a \in G$, 
$a \ne e$. If a has period $p^2$, we are done. Otherwise a has period p (why?). Let 
$H = \langle a \rangle$ be the subgroup generated by a. Let $b \in G$ and $b \notin H$. Prove that 
$G = \langle a, b \rangle$, namely G is generated by a and b. Using Exercise 30 of §4, 
conclude that H is normal. Get a homomorphism $G \to \mathrm{Aut}(H)$, and prove 
that the image of this homomorphism is trivial. Hence b commutes with a. 
Conclude that G is abelian by using Exercise 14 of §1.] 


II, §6. PERMUTATION GROUPS 

In this section, we investigate more closely the permutation group $S_n$ of n 
elements $\{1, \ldots, n\} = J_n$. This group is called the symmetric group. 

If $\sigma \in S_n$, then we recall that $\sigma^{-1}: J_n \to J_n$ is the permutation such that 
$\sigma^{-1}(k)$ = unique integer $j \in J_n$ such that $\sigma(j) = k$. A transposition $\tau$ is a 
permutation which interchanges two numbers and leaves the others fixed, 
i.e. there exist integers $i, j \in J_n$, $i \ne j$, such that $\tau(i) = j$, $\tau(j) = i$, and 
$\tau(k) = k$ if $k \ne i$ and $k \ne j$. One sees at once that if $\tau$ is a transposition, 
then $\tau^{-1} = \tau$ and $\tau^2 = I$. In particular, the inverse of a transposition is a 
transposition. We shall prove that the transpositions generate $S_n$. 

Theorem 6.1. Every permutation of J n can be expressed as a product of 

transpositions. 

Proof. We shall prove our assertion by induction on n. For n = 1, 
there is nothing to prove. Let n > 1 and assume the assertion proved for 
n - 1. Let $\sigma$ be a permutation of $J_n$. Let $\sigma(n) = k$. Let $\tau$ be the transposition 
of $J_n$ such that $\tau(k) = n$, $\tau(n) = k$. (If k = n, then $\sigma$ already leaves n 
fixed, and we take $\tau$ to be the identity in what follows.) Then $\tau\sigma$ is a 
permutation such that 

\[ \tau\sigma(n) = \tau(k) = n. \]

In other words, $\tau\sigma$ leaves n fixed. We may therefore view $\tau\sigma$ as a permutation 
of $J_{n-1}$, and by induction, there exist transpositions $\tau_1, \ldots, \tau_s$ of 
$J_{n-1}$, leaving n fixed, such that 

\[ \tau\sigma = \tau_1 \cdots \tau_s. \]

We can now write 

\[ \sigma = \tau^{-1}\tau_1 \cdots \tau_s, \]

thereby proving our assertion. 
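
The induction in this proof is effectively an algorithm. The Python sketch below follows it step by step; the representation of a permutation as a dictionary and the function names are our own assumptions, not notation from the text. At each stage it multiplies on the left by the transposition that moves the current largest index back to its place, and records that transposition.

```python
def compose(s, t):
    """Return the product st, i.e. the map i -> s(t(i))."""
    return {i: s[t[i]] for i in t}

def transposition(n, a, b):
    """Transposition of {1,...,n} interchanging a and b."""
    t = {i: i for i in range(1, n + 1)}
    t[a], t[b] = b, a
    return t

def as_transpositions(sigma):
    """Write sigma as a product of transpositions, following the induction.

    Returns pairs [(a1, b1), ..., (am, bm)] such that
    sigma = [a1 b1] ... [am bm], the rightmost factor being applied first.
    """
    sigma = dict(sigma)
    n = len(sigma)
    factors = []
    for m in range(n, 1, -1):
        k = sigma[m]
        if k != m:
            # tau interchanges k and m, so tau * sigma leaves m fixed
            tau = transposition(n, k, m)
            sigma = compose(tau, sigma)
            factors.append((k, m))
    return factors

def product(n, factors):
    """Multiply out a list of transpositions, left to right."""
    result = {i: i for i in range(1, n + 1)}
    for a, b in factors:
        result = compose(result, transposition(n, a, b))
    return result

sigma = {1: 2, 2: 3, 3: 1}
factors = as_transpositions(sigma)
assert product(3, factors) == sigma
print(factors)
```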

A permutation $\sigma$ of $\{1, \ldots, n\}$ is sometimes denoted by 

\[ \begin{bmatrix} 1 & \cdots & n \\ \sigma(1) & \cdots & \sigma(n) \end{bmatrix}. \]

Thus 

\[ \begin{bmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{bmatrix} \]

denotes the permutation $\sigma$ such that $\sigma(1) = 2$, $\sigma(2) = 1$, and $\sigma(3) = 3$. 
This permutation is in fact a transposition. 

Let $i_1, \ldots, i_r$ be distinct integers in $J_n$. By the symbol 

\[ [i_1 \cdots i_r] \]

we shall mean the permutation $\sigma$ such that 

\[ \sigma(i_1) = i_2, \quad \sigma(i_2) = i_3, \quad \ldots, \quad \sigma(i_r) = i_1, \]

and $\sigma$ leaves all other integers fixed. For example 

\[ [132] \]

denotes the permutation $\sigma$ such that $\sigma(1) = 3$, $\sigma(3) = 2$, $\sigma(2) = 1$, and $\sigma$ 
leaves all other integers fixed. Such a permutation is called a cycle, or 
more precisely, an r-cycle. 

If $\sigma = [i_1 \cdots i_r]$ is a cycle, then one verifies at once that $\sigma^{-1}$ is also a 
cycle, and that in fact 

\[ \sigma^{-1} = [i_r \cdots i_1]. \]

Thus if $\sigma = [132]$ then 

\[ \sigma^{-1} = [231]. \]

Note that a 2-cycle $[ij]$ is nothing but a transposition, namely the transposition 
such that $i \mapsto j$ and $j \mapsto i$. 

A product of cycles is easily determined. For instance, 

\[ [132][34] = [2134]. \]

One sees this using the definition: If $\sigma = [132]$ and $\tau = [34]$, then for 
instance 

\[ \begin{aligned}
\sigma(\tau(3)) &= \sigma(4) = 4, \\
\sigma(\tau(4)) &= \sigma(3) = 2, \\
\sigma(\tau(2)) &= \sigma(2) = 1, \\
\sigma(\tau(1)) &= \sigma(1) = 3.
\end{aligned} \]
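
The same computation can be carried out mechanically. In the sketch below, a permutation of $\{1, \ldots, n\}$ is stored as a dictionary and the helper names `cycle` and `compose` are our own. The product is taken with the right-hand factor applied first, which is the convention used in the computation above.

```python
def cycle(n, *entries):
    """The cycle [entries] as a permutation of {1,...,n}."""
    s = {i: i for i in range(1, n + 1)}
    # each entry is sent to the next one, the last back to the first
    for a, b in zip(entries, entries[1:] + entries[:1]):
        s[a] = b
    return s

def compose(s, t):
    """Return the product st, i.e. the map i -> s(t(i))."""
    return {i: s[t[i]] for i in t}

sigma = cycle(4, 1, 3, 2)
tau = cycle(4, 3, 4)
assert compose(sigma, tau) == cycle(4, 2, 1, 3, 4)
print(compose(sigma, tau))
```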
Let G be a group. Recall that G is solvable if and only if there exists a 
sequence of subgroups 

\[ G = H_0 \supset H_1 \supset H_2 \supset \cdots \supset H_m = \{e\} \]

such that $H_i$ is normal in $H_{i-1}$ and such that the factor group $H_{i-1}/H_i$ is 
abelian, for $i = 1, \ldots, m$. We shall prove that for $n \ge 5$, the group $S_n$ is 
not solvable. We need some preliminaries. 

Theorem 6.2. Let G be a group, and H a normal subgroup. Then G/H 
is abelian if and only if H contains all elements of the form $xyx^{-1}y^{-1}$, 
with $x, y \in G$. 

Proof. Let $f: G \to G/H$ be the canonical homomorphism. Assume that 
G/H is abelian. For any $x, y \in G$ we have 

\[ f(xyx^{-1}y^{-1}) = f(x)f(y)f(x)^{-1}f(y)^{-1}, \]

and since G/H is abelian, the expression on the right-hand side is equal 
to the unit element of G/H. Hence $xyx^{-1}y^{-1} \in H$. Conversely, assume 
that this is the case for all $x, y \in G$. Let $\bar x, \bar y$ be elements of G/H. Since 
f is surjective, there exist $x, y \in G$ such that $\bar x = f(x)$ and $\bar y = f(y)$. Let 
$\bar e$ be the unit element of G/H, and e the unit element of G. Then 

\[ \bar e = f(e) = f(xyx^{-1}y^{-1}) = f(x)f(y)f(x)^{-1}f(y)^{-1} = \bar x \bar y \bar x^{-1} \bar y^{-1}. \]

Multiplying by $\bar y$ and $\bar x$ on the right, we find 

\[ \bar y \bar x = \bar x \bar y, \]

and hence G/H is abelian. 

Theorem 6.3. If $n \ge 5$, then $S_n$ is not solvable. 

Proof. We shall first prove that if H, N are two subgroups of $S_n$ such 
that $N \subset H$ and N is normal in H, if H contains every 3-cycle, and if 
H/N is abelian, then N contains every 3-cycle. To see this, let i, j, k, r, s 
be five distinct integers between 1 and n, and let 

\[ \sigma = [ijk] \quad\text{and}\quad \tau = [krs]. \]

Then 

\[ \sigma\tau\sigma^{-1}\tau^{-1} = [ijk][krs][kji][srk] = [rki]. \]

Since H/N is abelian and $\sigma, \tau \in H$, Theorem 6.2 shows that this commutator 
lies in N. Since the choice of i, j, k, r, s was arbitrary, we see that the cycles $[rki]$ 
all lie in N for all choices of distinct r, k, i, thereby proving what we 
wanted. 
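
The commutator identity used here is easy to confirm by machine for n = 5. The sketch below is only a numerical check, not part of the proof. It stores permutations as dictionaries (our own convention), multiplies with the right-hand factor applied first, and runs through all 120 ways of assigning the five distinct values 1, ..., 5 to i, j, k, r, s.

```python
from itertools import permutations

def cycle(n, *entries):
    """The cycle [entries] as a permutation of {1,...,n}."""
    s = {i: i for i in range(1, n + 1)}
    for a, b in zip(entries, entries[1:] + entries[:1]):
        s[a] = b
    return s

def compose(s, t):
    """Return the product st, i.e. the map i -> s(t(i))."""
    return {i: s[t[i]] for i in t}

def inverse(s):
    """Inverse permutation."""
    return {image: i for i, image in s.items()}

def commutator(s, t):
    """The element s t s^{-1} t^{-1}."""
    return compose(compose(s, t), compose(inverse(s), inverse(t)))

n = 5
checked = 0
for i, j, k, r, s in permutations(range(1, n + 1)):
    sigma, tau = cycle(n, i, j, k), cycle(n, k, r, s)
    # [ijk][krs][kji][srk] = [rki]
    assert commutator(sigma, tau) == cycle(n, r, k, i)
    checked += 1
print(checked)
```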

Now suppose that we have a chain of subgroups 

\[ S_n = H_0 \supset H_1 \supset H_2 \supset \cdots \supset H_m = \{e\} \]

such that $H_\nu$ is normal in $H_{\nu-1}$ for $\nu = 1, \ldots, m$, and $H_{\nu-1}/H_\nu$ is abelian. 
Since $S_n$ contains every 3-cycle, we conclude that $H_1$ contains every 
3-cycle. By induction on $\nu$, we conclude that $H_m = \{e\}$ contains every 
3-cycle, which is impossible. Hence such a chain of subgroups cannot 
exist, and our theorem is proved. 


The sign of a permutation 

For the next theorem, we need to describe the operation of permutations 
on functions. Let f be a function of n real variables, so we can evaluate 

\[ f(x_1, \ldots, x_n) \quad\text{for } x_1, \ldots, x_n \in \mathbf{R}. \]

Let $\sigma$ be a permutation of $J_n$. Then we define the function $\sigma f$ by 

\[ (\sigma f)(x_1, \ldots, x_n) = f(x_{\sigma(1)}, \ldots, x_{\sigma(n)}). \]

Then we have for $\sigma, \tau \in S_n$: 

(1) $(\sigma\tau)f = \sigma(\tau f)$. 

Proof. We use the definition applied to the function $g = \tau f$. Then 

\[ \begin{aligned}
\sigma(\tau f)(x_1, \ldots, x_n) &= (\tau f)(x_{\sigma(1)}, \ldots, x_{\sigma(n)}) \\
&= f(x_{\sigma\tau(1)}, \ldots, x_{\sigma\tau(n)}) \\
&= ((\sigma\tau)f)(x_1, \ldots, x_n),
\end{aligned} \]

thus proving (1). Note that the middle step comes from applying the 
definition 

\[ (\tau f)(y_1, \ldots, y_n) = f(y_{\tau(1)}, \ldots, y_{\tau(n)}) \quad\text{with } y_i = x_{\sigma(i)}. \]

It is trivially verified that if $I: J_n \to J_n$ is the identity permutation, then 

(2) $If = f$. 
If f, g are two functions of n variables, then we may form their sum 
and product as usual. The sum f + g is defined by the rule 

\[ (f + g)(x_1, \ldots, x_n) = f(x_1, \ldots, x_n) + g(x_1, \ldots, x_n) \]

and the product fg is defined by 

\[ (fg)(x_1, \ldots, x_n) = f(x_1, \ldots, x_n)\,g(x_1, \ldots, x_n). \]

We claim that 

(3) $\sigma(f + g) = \sigma f + \sigma g$ and $\sigma(fg) = (\sigma f)(\sigma g)$. 

To see this, we have 

\[ \begin{aligned}
(\sigma(f + g))(x_1, \ldots, x_n) &= (f + g)(x_{\sigma(1)}, \ldots, x_{\sigma(n)}) \\
&= f(x_{\sigma(1)}, \ldots, x_{\sigma(n)}) + g(x_{\sigma(1)}, \ldots, x_{\sigma(n)}) \\
&= (\sigma f)(x_1, \ldots, x_n) + (\sigma g)(x_1, \ldots, x_n),
\end{aligned} \]

thereby proving (3) for the sum. The formula for the product is done the 
same way. As a consequence, for every number c we have 

\[ \sigma(cf) = c\,\sigma f. \]


Theorem 6.4. To each permutation $\sigma$ of $J_n$ it is possible to assign a sign 
1 or -1, denoted by $\varepsilon(\sigma)$, satisfying the following conditions: 

(a) If $\tau$ is a transposition, then $\varepsilon(\tau) = -1$. 

(b) If $\sigma, \sigma'$ are permutations of $J_n$, then 

\[ \varepsilon(\sigma\sigma') = \varepsilon(\sigma)\varepsilon(\sigma'). \]

Proof. Let $\Delta$ be the function 

\[ \Delta(x_1, \ldots, x_n) = \prod_{i<j} (x_j - x_i), \]

the product being taken for all pairs of integers i, j satisfying 

\[ 1 \le i < j \le n. \]
Let $\tau$ be a transposition, interchanging the two integers r and s. Say 
r < s. We wish to determine 

\[ \tau\Delta(x_1, \ldots, x_n) = \prod_{i<j} (x_{\tau(j)} - x_{\tau(i)}) = \prod_{i<j} \tau(x_j - x_i). \]

For one factor, we have 

\[ \tau(x_s - x_r) = (x_r - x_s) = -(x_s - x_r). \]

If a factor does not contain $x_r$ or $x_s$ in it, then it remains unchanged 
when we apply $\tau$. All other factors can be considered in pairs as follows: 

\[ \begin{aligned}
(x_k - x_s)(x_k - x_r) &\quad\text{if } k > s, \\
(x_s - x_k)(x_k - x_r) &\quad\text{if } r < k < s, \\
(x_s - x_k)(x_r - x_k) &\quad\text{if } k < r.
\end{aligned} \]

Each one of these pairs remains unchanged when we apply $\tau$. Hence we 
see that 

\[ \tau\Delta = -\Delta. \]

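Now let $\sigma$ be an arbitrary permutation. Clearly, $\sigma\Delta = \pm\Delta$. Define $\varepsilon(\sigma)$ to 
be the sign 1 or -1 such that 

\[ \sigma(\Delta) = \varepsilon(\sigma)\Delta. \]

It is now useful to use the notation $\pi(\sigma)f$ instead of $\sigma f$ when applying a permutation 
to a function f. Thus $\sigma$ is in the group of permutations of 
$\{1, \ldots, n\}$, whereas $\pi(\sigma)$ is in the group of permutations of functions of n 
variables. Then 

\[ \pi(\sigma\sigma') = \pi(\sigma)\pi(\sigma'). \]

Applying both sides to $\Delta$ and using $\sigma(cf) = c\,\sigma f$, we get 

\[ \varepsilon(\sigma\sigma')\Delta = \pi(\sigma)\bigl(\varepsilon(\sigma')\Delta\bigr) = \varepsilon(\sigma')\,\pi(\sigma)\Delta = \varepsilon(\sigma)\varepsilon(\sigma')\Delta. \]

It follows at once that $\sigma \mapsto \varepsilon(\sigma)$ is a homomorphism of $S_n$ into $\{1, -1\}$. The 
theorem follows. 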
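
The definition of the sign through $\Delta$ can be turned directly into a computation. The sketch below is an illustration under our own conventions: permutations act on the index set $\{0, \ldots, n-1\}$ instead of $\{1, \ldots, n\}$, and are stored as tuples. It evaluates $\sigma\Delta$ and $\Delta$ at one point with distinct integer coordinates, reads off the sign as the quotient, and checks the homomorphism property on all of $S_4$.

```python
from itertools import permutations
from math import prod

def delta(x):
    """Delta(x_1,...,x_n) = product over i < j of (x_j - x_i)."""
    n = len(x)
    return prod(x[j] - x[i] for i in range(n) for j in range(i + 1, n))

def act(sigma, f):
    """(sigma f)(x_1,...,x_n) = f(x_sigma(1),...,x_sigma(n)), 0-indexed."""
    return lambda x: f(tuple(x[sigma[i]] for i in range(len(x))))

def sign(sigma):
    """epsilon(sigma), read off from sigma(Delta) = epsilon(sigma) * Delta."""
    point = tuple(range(1, len(sigma) + 1))   # distinct values, so Delta != 0
    return act(sigma, delta)(point) // delta(point)

def compose(s, t):
    """Return the product st, i.e. the map i -> s(t(i))."""
    return tuple(s[t[i]] for i in range(len(t)))

S4 = list(permutations(range(4)))
# epsilon(sigma sigma') = epsilon(sigma) epsilon(sigma') on all of S_4
assert all(sign(compose(s, t)) == sign(s) * sign(t) for s in S4 for t in S4)
print(sign((1, 0, 2, 3)))   # a transposition
```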

In particular, if $\sigma$ is a product of transpositions 

\[ \sigma = \tau_1 \cdots \tau_m, \]

then 

\[ \varepsilon(\sigma) = (-1)^m. \]

Hence in any such product, m is either always odd or always even, depending 
on $\sigma$. We restate this as a corollary. 

Corollary 6.5. If a permutation $\sigma$ of $J_n$ is expressed as a product of 
transpositions, 

\[ \sigma = \tau_1 \cdots \tau_s, \]

then s is even or odd according as $\varepsilon(\sigma) = 1$ or $-1$. 


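Corollary 6.6. If $\sigma$ is a permutation of $J_n$, then 

\[ \varepsilon(\sigma) = \varepsilon(\sigma^{-1}). \]

Proof. We have 

\[ 1 = \varepsilon(\mathrm{id}) = \varepsilon(\sigma\sigma^{-1}) = \varepsilon(\sigma)\varepsilon(\sigma^{-1}). \]

Hence either $\varepsilon(\sigma)$ and $\varepsilon(\sigma^{-1})$ are both equal to 1, or both equal to -1, 
as desired. 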

As a matter of terminology, a permutation is called even if its sign is 
1, and it is called odd if its sign is -1. Thus every transposition is odd. 
From Theorem 6.4 we see that the map 

\[ \sigma \mapsto \varepsilon(\sigma) \]

is a homomorphism of $S_n$ onto the group consisting of the two elements 
1, -1. The kernel of this homomorphism by definition consists of 
the even permutations, and is called the alternating group $A_n$. If $\tau$ is a 
transposition, then $A_n$ and $\tau A_n$ are obviously distinct cosets of $A_n$, and 
every permutation lies in $A_n$ or $\tau A_n$. (Proof: If $\sigma \in S_n$ and $\sigma \notin A_n$, then 
$\varepsilon(\sigma) = -1$, so $\varepsilon(\tau\sigma) = 1$ and hence $\tau\sigma \in A_n$, whence $\sigma \in \tau^{-1}A_n = \tau A_n$.) 
Therefore, 

\[ A_n, \quad \tau A_n \]

are distinct cosets of $A_n$ in $S_n$, and there is no other coset. Since $A_n$ is 
the kernel of a homomorphism, it is normal in $S_n$. We have $\tau A_n = A_n\tau$, 
which can also be verified easily directly. 
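
These statements about the two cosets can be checked by enumeration for n = 4. The sketch below is an illustration with assumptions of our own: permutations are tuples on $\{0, 1, 2, 3\}$, and the sign is computed by counting inversions, which is a standard equivalent of the definition through $\Delta$ (each inversion corresponds to one factor of $\Delta$ changing sign).

```python
from itertools import permutations

def sign(s):
    """Sign of a permutation of {0,...,n-1}, via the number of inversions."""
    n = len(s)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                     if s[i] > s[j])
    return -1 if inversions % 2 else 1

def compose(s, t):
    """Return the product st, i.e. the map i -> s(t(i))."""
    return tuple(s[t[i]] for i in range(len(t)))

n = 4
S_n = set(permutations(range(n)))
A_n = {s for s in S_n if sign(s) == 1}      # the even permutations

tau = (1, 0, 2, 3)                          # a transposition
tau_A_n = {compose(tau, s) for s in A_n}    # left coset
A_n_tau = {compose(s, tau) for s in A_n}    # right coset

assert tau_A_n == A_n_tau                   # tau A_n = A_n tau
assert A_n | tau_A_n == S_n                 # there is no other coset
assert not (A_n & tau_A_n)                  # the two cosets are distinct
print(len(A_n), len(tau_A_n))
```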
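II, §6. EXERCISES 


1. Determine the sign of the following permutations. 

(a) $\begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{bmatrix}$ 

(b) $\begin{bmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{bmatrix}$ 

(c) [array of a permutation of {1, 2, 3, 4}; entries not legible in the source] 

(d) $\begin{bmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 1 & 4 \end{bmatrix}$ 

(e) $\begin{bmatrix} 1 & 2 & 3 & 4 \\ 2 & 1 & 4 & 3 \end{bmatrix}$ 

(f) [array of a permutation of {1, 2, 3, 4}; entries not legible in the source] 

2. In each one of the cases of Exercise 1, write the inverse of the permutation. 

3. Show that the number of odd permutations of $\{1, \ldots, n\}$ for $n \ge 2$ is equal to 
the number of even permutations. 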


4. Show that the groups $S_2$, $S_3$, $S_4$ are solvable. [Hint: For $S_4$, find a subgroup 
H of order 4 in $A_4$. Consider the homomorphism of $A_4$ into $S_3$ given by 
translation on the cosets of H. Analyze the kernel of this homomorphism.] 

5. Let $\sigma$ be the r-cycle $[i_1 \cdots i_r]$. Show that $\varepsilon(\sigma) = (-1)^{r+1}$. Hint: Use induction. 
If r = 2, then $\sigma$ is a transposition. If r > 2, then 

\[ [i_1 \cdots i_r] = [i_1 i_r][i_1 \cdots i_{r-1}]. \]

6. Two cycles $[i_1 \cdots i_r]$ and $[j_1 \cdots j_s]$ are said to be disjoint if no integer $i_\nu$ is 
equal to any integer $j_\mu$. Prove that a permutation is equal to a product of 
disjoint cycles. 

7. Express the permutations of Exercise 1 as a product of disjoint cycles. 

8. Show that the group of Exercise 8, §1 exists by exhibiting it as a subgroup of 
$S_4$ as follows. Let $\sigma = [1234]$ and $\tau = [24]$. Show that the subgroup 
generated by $\sigma, \tau$ has order 8, and that $\sigma, \tau$ satisfy the same relations as x, y 
in the exercise loc. cit. 


9. Show that the group of Exercise 10, §1 exists by exhibiting it as a subgroup 
of $S_6$. 

10. Let n be an even positive integer. Show that there exists a group of order 2n, 
generated by two elements $\sigma, \tau$ such that $\sigma^n = e = \tau^2$ and $\sigma\tau = \tau\sigma^{n-1}$. [Also 
see Exercises 6, 7 of Chapter VI, §2.] Draw a picture of a regular n-gon, 
number the vertices, and use the picture as an inspiration to get $\sigma, \tau$. 

11. Let G be a finite group of order 2k for some positive integer k. 

(a) Prove that G has an element of period 2. [Hint: show that there exists 
$x \in G$, $x \ne e$, such that $x = x^{-1}$.] 

(b) Assume that k is odd. Let $a \in G$ have period 2 and let $T_a: G \to G$ be 
translation by a. Prove that $T_a$ is an odd permutation. 

12. Let G be a finite group of even order 2k, and assume that k is odd. Prove 
that G has a normal subgroup of order k. [Hint: Use the previous exercise.] 
II, §7. FINITE ABELIAN GROUPS 

The groups referred to in the title of this section occur so frequently that 
it is worthwhile to state a theorem which describes their structure completely. 
Throughout this section we write our abelian groups additively. 

Let A be an abelian group. An element $a \in A$ is said to be a torsion 
element if it has finite period. The subset of all torsion elements of A is 
a subgroup of A called the torsion subgroup of A. (If a has period m and 
b has period n then, writing the group law additively, we see that $a \pm b$ 
has a period dividing mn.) Let p be a prime number. A p-group is a 
finite group whose order is a power of p. Finally, a group is said to be 
of exponent m if every element has period dividing m. 

If A is an abelian group and p a prime number, we denote by A(p) 
the subgroup of all elements $x \in A$ whose period is a power of p. Then 
A(p) is a torsion group, and is a p-group if it is finite. 

Theorem 7.1. Let A be an additive abelian group of exponent n. If 
n = mm' is a factorization with (m, m') = 1, then A is the direct sum 

\[ A = A_m \oplus A_{m'}. \]

The group A is the direct sum of its subgroups A(p) for all primes p 
dividing n. 

Proof. For the first statement, since m, m' are relatively prime, there 
exist integers r, s such that 

\[ rm + sm' = 1. \]

Let $x \in A$. Then 

(*) $x = 1x = rmx + sm'x$. 

But $rmx \in A_{m'}$ because $m'rmx = rmm'x = rnx = 0$. Similarly, $sm'x \in A_m$. 
Hence $A = A_m + A_{m'}$. This sum is direct, for suppose $x \in A_m \cap A_{m'}$. By 
the same formula (*) we see that x = 0, so $A_m \cap A_{m'} = \{0\}$, whence the 
sum is direct. The final statement is obtained by writing n as a product 
of distinct prime power factors 

\[ n = \prod_i p_i^{e_i} \]

and using induction. We thus get $A = \bigoplus_i A(p_i)$. 

Note that if A is finite, we can take n equal to the order of A. 
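
The proof is constructive: formula (*) splits each element into its two components. The sketch below carries this out for $A = \mathbf{Z}/12\mathbf{Z}$ with m = 4 and m' = 3. The example and the helper name `bezout` are our own assumptions, and $A_m$ is taken to mean the subgroup of elements x with mx = 0, as in the proof above.

```python
def bezout(m, m2):
    """Return (r, s) with r*m + s*m2 == gcd(m, m2) (extended Euclid)."""
    old_r, r = m, m2
    old_a, a = 1, 0
    old_b, b = 0, 1
    while r:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_a, a = a, old_a - q * a
        old_b, b = b, old_b - q * b
    return old_a, old_b

n, m, m2 = 12, 4, 3            # n = m * m' with (m, m') = 1
r, s = bezout(m, m2)           # r*m + s*m' = 1

A_m = {x for x in range(n) if (m * x) % n == 0}     # elements killed by m
A_m2 = {x for x in range(n) if (m2 * x) % n == 0}   # elements killed by m'

for x in range(n):
    u = (s * m2 * x) % n       # the component s m' x, which lies in A_m
    v = (r * m * x) % n        # the component r m x, which lies in A_m'
    assert u in A_m and v in A_m2 and (u + v) % n == x

print(sorted(A_m), sorted(A_m2), A_m & A_m2)
```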

Our next task is to describe the structure of finite abelian p-groups. 
Let $r_1, \ldots, r_s$ be integers $\ge 1$. A finite p-group A is said to be of type 
$(p^{r_1}, \ldots, p^{r_s})$ if A is isomorphic to the product of cyclic groups of orders 
$p^{r_i}$ ($i = 1, \ldots, s$). 

Example. A group of type (p, p) is isomorphic to a product of cyclic 
groups $\mathbf{Z}/p\mathbf{Z} \times \mathbf{Z}/p\mathbf{Z}$. 

Theorem 7.2. Every finite abelian p-group is isomorphic to a product of 
cyclic p-groups. If it is of type $(p^{r_1}, \ldots, p^{r_s})$ with 

\[ r_1 \ge r_2 \ge \cdots \ge r_s \ge 1, \]

then the sequence of integers $(r_1, \ldots, r_s)$ is uniquely determined. 

Proof. Let A be a finite abelian p-group. We shall need the following 
remark: Let b be an element of A, $b \ne 0$. Let k be an integer $\ge 0$ such 
that $p^k b \ne 0$, and let $p^m$ be the period of $p^k b$. Then b has period $p^{k+m}$. 
Proof: We certainly have $p^{k+m} b = 0$, and if $p^n b = 0$ then first $n \ge k$, and 
second $n \ge k + m$, otherwise the period of $p^k b$ would be smaller than $p^m$. 

We shall now prove the existence of the desired product by induction. 
Let $a_1 \in A$ be an element of maximal period. We may assume without 
loss of generality that A is not cyclic. Let $A_1$ be the cyclic subgroup 
generated by $a_1$, say of period $p^{r_1}$. We need a lemma. 

Lemma 7.3. Let $\bar b$ be an element of $A/A_1$, of period $p^r$. Then there exists 
a representative a of $\bar b$ in A which also has period $p^r$. 

Proof. Let b be any representative of $\bar b$ in A. Then $p^r b$ lies in $A_1$, say 
$p^r b = na_1$ with some integer $n \ge 0$. If n = 0 we let a = b. Suppose $n \ne 0$. 
We note that the period of $\bar b$ is $\le$ the period of b. Write $n = p^k t$ where t 
is prime to p. Then $ta_1$ is also a generator of $A_1$, and hence has period 
$p^{r_1}$. We may assume $k \le r_1$. Then $p^k t a_1$ has period $p^{r_1 - k}$. By our 
previous remark, the element b has period 

\[ p^{r + r_1 - k}, \]

whence by hypothesis, $r + r_1 - k \le r_1$ and $r \le k$. This proves that there 
exists an element $c \in A_1$ such that $p^r b = p^r c$. Let a = b - c. Then a is a 
representative for $\bar b$ in A and $p^r a = 0$. Thus period(a) $\le p^r$, and since the 
period of $\bar b$ is at most the period of any representative, we conclude 
that a has period equal to $p^r$. 

We return to the main proof. By induction, the factor group $A/A_1$ 
is expressible as a direct sum 

\[ A/A_1 = \bar A_2 \oplus \cdots \oplus \bar A_s \]

of cyclic subgroups of orders $p^{r_2}, \ldots, p^{r_s}$ respectively, such that 
$r_2 \ge \cdots \ge r_s$. Let $\bar a_i$ be a generator for $\bar A_i$ ($i = 2, \ldots, s$) and let $a_i$ be a 
representative in A having the same period as $\bar a_i$. Let $A_i$ be the cyclic 
subgroup generated by $a_i$. We contend that A is the direct sum of 
$A_1, \ldots, A_s$. 

We let $\bar A = A/A_1$. First observe that since $a_i$ and $\bar a_i$ have the same 
period, the canonical homomorphism $A \to \bar A$ induces an isomorphism 
$A_j \to \bar A_j$ for $j = 2, \ldots, s$. 

Next we prove that $A = A_1 + \cdots + A_s$. Given $x \in A$, let $\bar x$ denote its 
image in $A/A_1 = \bar A$. Then there exist elements $x_j \in A_j$ for $j = 2, \ldots, s$ such 
that 

\[ \bar x = \bar x_2 + \cdots + \bar x_s. \]

Hence $x - x_2 - \cdots - x_s$ lies in $A_1$, so there exists an element $x_1 \in A_1$ such 
that 

\[ x = x_1 + x_2 + \cdots + x_s, \]

which proves that A is the sum of the subgroups $A_i$ ($i = 1, \ldots, s$). 

To prove that A is the direct sum, suppose $x \in A$ and 

\[ x = x_1 + \cdots + x_s = y_1 + \cdots + y_s \quad\text{with } x_i, y_i \in A_i. \]

Subtracting, and letting $z_i = y_i - x_i$, we find that 

\[ 0 = z_1 + \cdots + z_s \quad\text{with } z_i \in A_i. \]

Then 

\[ 0 = \bar z_2 + \cdots + \bar z_s, \]

whence $\bar z_j = 0$ for $j = 2, \ldots, s$ because $\bar A$ is the direct sum of the subgroups 
$\bar A_j$ for $j = 2, \ldots, s$. But since $A_j \approx \bar A_j$, it follows that $z_j = 0$ for $j = 2, \ldots, s$. 
Hence $0 = z_1$ also. Hence, finally, $x_i = y_i$ for $i = 1, \ldots, s$, which concludes 
the proof of the existence part of the theorem. 

We prove uniqueness, by induction on the order of A. Suppose that A 
is written in two ways as a product of cyclic groups, say of type 

\[ (p^{r_1}, \ldots, p^{r_s}) \quad\text{and}\quad (p^{m_1}, \ldots, p^{m_k}) \]

with $r_1 \ge \cdots \ge r_s \ge 1$ and $m_1 \ge \cdots \ge m_k \ge 1$. Then pA is also a p-group, 
of order strictly less than the order of A, and is of type 

\[ (p^{r_1 - 1}, \ldots, p^{r_s - 1}) \quad\text{and}\quad (p^{m_1 - 1}, \ldots, p^{m_k - 1}), \]

it being understood that if some exponent $r_i$ or $m_j$ is equal to 1, then the 
factor corresponding to 

\[ p^{r_i - 1} \quad\text{or}\quad p^{m_j - 1} \]

in pA is simply the trivial group 0. Let $i = 1, \ldots, n$ be those integers such 
that $r_i \ge 2$. Since #(pA) < #(A), by induction the subsequence of 

\[ (r_1 - 1, \ldots, r_s - 1) \]

consisting of those integers $\ge 1$ is uniquely determined, and is the same 
as the corresponding subsequence of 

\[ (m_1 - 1, \ldots, m_k - 1). \]

In other words, we have $r_i - 1 = m_i - 1$ for all those integers $i = 1, \ldots, n$. 
Hence $r_i = m_i$ for $i = 1, \ldots, n$, and the two sequences 

\[ (p^{r_1}, \ldots, p^{r_s}) \quad\text{and}\quad (p^{m_1}, \ldots, p^{m_k}) \]

can differ only in their last components which are equal to p. These 
correspond to factors of type $(p, \ldots, p)$ occurring say $\nu$ times in the first 
sequence and $\mu$ times in the second sequence. Thus A is of type 

\[ (p^{r_1}, \ldots, p^{r_n}, \underbrace{p, \ldots, p}_{\nu \text{ times}}) \quad\text{and}\quad (p^{r_1}, \ldots, p^{r_n}, \underbrace{p, \ldots, p}_{\mu \text{ times}}). \]

Hence the order of A is equal to 

\[ p^{r_1 + \cdots + r_n + \nu} = p^{r_1 + \cdots + r_n + \mu}, \]

whence $\nu = \mu$, and our theorem is proved. 
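
The key step of the uniqueness argument is that pA has type $(p^{r_1 - 1}, \ldots, p^{r_s - 1})$ with trivial factors dropped. The sketch below checks the resulting order of pA by brute force for the three types of abelian groups of order 8. The representation of group elements as tuples of residues and the function names are our own assumptions.

```python
from itertools import product

def group(p, exponents):
    """Elements of Z/p^r1 x ... x Z/p^rs, as tuples of residues."""
    return list(product(*(range(p ** r) for r in exponents)))

def p_multiples(p, exponents):
    """The subgroup pA of A = Z/p^r1 x ... x Z/p^rs."""
    moduli = [p ** r for r in exponents]
    return {tuple((p * c) % q for c, q in zip(x, moduli))
            for x in group(p, exponents)}

p = 2
sizes = {}
for exponents in [(3,), (2, 1), (1, 1, 1)]:
    A = group(p, exponents)
    pA = p_multiples(p, exponents)
    # pA should have order p^(sum of (r_i - 1))
    assert len(pA) == p ** sum(r - 1 for r in exponents)
    sizes[exponents] = (len(A), len(pA))

print(sizes)
```

All three groups have order 8, but the orders of pA differ, so no two of the three types can describe the same group.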


II, §7. EXERCISES 

1. Let $A$ be a finite abelian group, $B$ a subgroup, and $C = A/B$. Assume that
the orders of $B$ and $C$ are relatively prime. Show that there is a subgroup $C'$
of $A$ isomorphic to $C$, such that

$$A = B \oplus C'.$$

2. Using the structure theorem for abelian groups, prove the following:

(a) A finite abelian group $G$ is cyclic if and only if, for every prime $p$ dividing the
order of $G$, there is one and only one subgroup of order $p$.

(b) Let $G$ be a finite abelian group which is not cyclic. Then there exists
a prime $p$ such that $G$ contains a subgroup $C_1 \times C_2$ where $C_1$ and $C_2$ are
cyclic of order $p$.

3. Let $f: A \to B$ be a homomorphism of abelian groups. Assume that there
exists a homomorphism $g: B \to A$ such that $f \circ g = \mathrm{id}_B$.

(a) Prove that $A$ is the direct sum

$$A = \operatorname{Ker} f \oplus \operatorname{Im} g.$$

(b) Prove that $f$ and $g$ are inverse isomorphisms between $g(B)$ and $B$.

Note: You may wish to do the following exercises only after reading §3 of the next
chapter, especially Exercises 7 through 12 of III, §3. Then the following Exercises
5 and 6 are completely subsumed by Exercises 7 through 12 of Chapter III, §3.

4. Define $(\mathbf{Z}/n\mathbf{Z})^*$ to be the set of elements in $\mathbf{Z}/n\mathbf{Z}$ having coset representatives
which are integers $a \in \mathbf{Z}$ prime to $n$. Show that $(\mathbf{Z}/n\mathbf{Z})^*$ is a multiplicative
group.

5. Let $G$ be a multiplicative cyclic group of order $N$ and let $\mathrm{Aut}(G)$ be its group
of automorphisms. For each $a \in (\mathbf{Z}/N\mathbf{Z})^*$ let $\sigma_a: G \to G$ be the map such that
$\sigma_a(w) = w^a$. Show that $\sigma_a \in \mathrm{Aut}(G)$ and that the association

$$a \mapsto \sigma_a$$

is an isomorphism of $(\mathbf{Z}/N\mathbf{Z})^*$ with $\mathrm{Aut}(G)$.

6. (a) Let $n$ be a positive integer, which can be written $n = n_1 n_2$ where $n_1, n_2$ are
integers $\geq 2$ and relatively prime. Show that there is an isomorphism

$$f: \mathbf{Z}/n\mathbf{Z} \to \mathbf{Z}/n_1\mathbf{Z} \times \mathbf{Z}/n_2\mathbf{Z}$$

where the map $f$ associates to each residue class $a \bmod n\mathbf{Z}$ the pair of
classes

$$a \bmod n\mathbf{Z} \mapsto (a \bmod n_1\mathbf{Z},\ a \bmod n_2\mathbf{Z}).$$

For the surjectivity, you will need the Chinese Remainder Theorem of
Chapter I, §5, Exercise 5.

(b) Extend the result to the case when $n = n_1 n_2 \cdots n_r$ is a product of pairwise
relatively prime integers $\geq 2$.

7. Let $n$ be a positive integer, and write $n$ as a product of prime powers

$$n = \prod_{i} p_i^{r_i}.$$

Show that the map $f$ of Exercise 6 restricts to an isomorphism of
multiplicative groups

$$(\mathbf{Z}/n\mathbf{Z})^* \approx \prod_{i} (\mathbf{Z}/p_i^{r_i}\mathbf{Z})^*.$$

8. Let $p$ be a prime number. Let $1 \leq k \leq m$. Let $U_k = U_k(p^m)$ be the subset of
$(\mathbf{Z}/p^m\mathbf{Z})^*$ consisting of those elements which have a representative in $\mathbf{Z}$ which
is $\equiv 1 \bmod p^k$. Thus

$$U_k = 1 + p^k\mathbf{Z} \bmod p^m\mathbf{Z}.$$

Show that $U_k$ is a subgroup of $(\mathbf{Z}/p^m\mathbf{Z})^*$. Thus we have a sequence of
subgroups

$$U_1 \supset U_2 \supset \cdots \supset U_m.$$




9. (a) Let $p$ be an odd prime and let $m \geq 2$. Show that for $k \leq m - 1$ in
Exercise 8, the map

$$\mathbf{Z}/p\mathbf{Z} \to U_k/U_{k+1}$$

given by $(a \bmod p) \mapsto 1 + p^k a \bmod p^{k+1}\mathbf{Z}$ is an isomorphism.

(b) Prove that the order of $U_1$ is $p^{m-1}$.

(c) Prove that $U_1$ is cyclic and that $1 + p \bmod p^m\mathbf{Z}$ is a generator.

[Hint: If $p$ is odd, show by induction that for every positive integer $r$,

$$(1 + p)^{p^r} \equiv 1 + p^{r+1} \bmod p^{r+2}.$$

Is this assertion still true if $p = 2$? State and prove the analogue of this
assertion for $(1 + 4)^{2^r}$.]

10. Let $p$ be an odd prime. Show that there is an isomorphism

$$(\mathbf{Z}/p^m\mathbf{Z})^* \approx (\mathbf{Z}/p\mathbf{Z})^* \times U_1.$$

11. (a) For the prime 2, and an integer $m \geq 2$, show that

$$(\mathbf{Z}/2^m\mathbf{Z})^* \approx \{1, -1\} \times U_2.$$

(b) Show that the group $U_2 = 1 + 4\mathbf{Z} \bmod 2^m\mathbf{Z}$ is cyclic, and that the class of
$1 + 4$ is a generator. What is the order of $U_2$ in this case?

12. Let $\varphi$ be the Euler phi function, i.e. $\varphi(n)$ is the order of $(\mathbf{Z}/n\mathbf{Z})^*$.

(a) If $p$ is a prime number, and $r$ an integer $\geq 1$, show that

$$\varphi(p^r) = (p - 1)p^{r-1}.$$

(b) Prove that if $m$, $n$ are positive relatively prime integers, then

$$\varphi(mn) = \varphi(m)\varphi(n).$$


(You may (should) of course use previous exercises to do this.) 
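
The two identities of Exercise 12 can be tested numerically before one proves them. The following is only a sketch: it computes the order of $(\mathbf{Z}/n\mathbf{Z})^*$ by brute force, and the function names `units` and `phi` are chosen here for illustration. A finite check of this kind is of course not a proof.

```python
from math import gcd

def units(n):
    """Representatives in 0..n-1 of the classes in (Z/nZ)*."""
    return [a for a in range(n) if gcd(a, n) == 1]

def phi(n):
    """Euler phi function: the order of (Z/nZ)*, counted directly."""
    return len(units(n))

# (a) phi(p^r) = (p - 1) p^(r - 1) for a few primes and exponents.
for p in (2, 3, 5, 7):
    for r in (1, 2, 3):
        assert phi(p**r) == (p - 1) * p**(r - 1)

# (b) phi(mn) = phi(m) phi(n) whenever m and n are relatively prime.
for m in range(1, 30):
    for n in range(1, 30):
        if gcd(m, n) == 1:
            assert phi(m * n) == phi(m) * phi(n)
```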
II, §8. OPERATION OF A GROUP ON A SET

In some sense, this section is a continuation of §6. We shall define 
a general notion which contains permutation groups as a special 
case, and which has already been illustrated in the exercises of §3 
and §4. 

Let $G$ be a group and let $S$ be a set. An operation or an action
of $G$ on $S$ is a homomorphism

$$\pi: G \to \mathrm{Perm}(S)$$

of $G$ into the group of permutations of $S$. We denote the permutation
associated with an element $x \in G$ by $\pi_x$. Thus the homomorphism is
denoted by

$$x \mapsto \pi_x.$$

Given $s \in S$, the image of $s$ under the permutation $\pi_x$ is $\pi_x(s)$. From
such an operation, we obtain a mapping

$$G \times S \to S$$

which to each pair $(x, s)$ with $x \in G$ and $s \in S$ associates the element
$\pi_x(s)$ of $S$. Sometimes we abbreviate the notation and write simply
$xs$ instead of $\pi_x(s)$. With this simpler notation, we have the two
properties:

OP 1. For all $x, y \in G$ and $s \in S$ we have associativity

$$x(ys) = (xy)s.$$

OP 2. If $e$ is the unit element of $G$, then $es = s$ for all $s \in S$.

Note that the formula of OP 1 is simply an abbreviation for the property

$$\pi_{xy} = \pi_x \pi_y.$$

Similarly, the formula OP 2 is an abbreviation for the property that $\pi_e$ is
the identity permutation, namely

$$\pi_e(s) = s \quad\text{for all } s \in S.$$

Conversely, if we are given a mapping

$$G \times S \to S \quad\text{denoted by } (x, s) \mapsto xs,$$

satisfying OP 1 and OP 2, then for each $x \in G$ the map $s \mapsto xs$ is a
permutation of $S$ which we may denote by $\pi_x(s)$. Then $x \mapsto \pi_x$ is a
homomorphism of $G$ into $\mathrm{Perm}(S)$. (Proof?) So an operation of $G$ on a
set $S$ could also be defined as a mapping $G \times S \to S$ satisfying properties
OP 1 and OP 2. Hence we often use the abbreviated notation $xs$ instead
of $\pi_x(s)$ for the operation.
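
To make the two properties concrete, here is a small sketch which checks OP 1 and OP 2 by exhaustion in one case: the full permutation group of the set $\{0, 1, 2\}$ operating on that set. The encoding of a permutation as a tuple and the names `mul` and `act` are assumptions made for this illustration only.

```python
from itertools import permutations

S = (0, 1, 2)
# Elements of G = Perm(S); a permutation x is stored as a tuple with x[s] the image of s.
G = list(permutations(S))
e = (0, 1, 2)  # the unit element: the identity permutation

def mul(x, y):
    """Product xy in G: apply y first, then x."""
    return tuple(x[y[s]] for s in S)

def act(x, s):
    """The operation G x S -> S, written xs in the text."""
    return x[s]

# OP 1: x(ys) = (xy)s for all x, y in G and s in S.
op1 = all(act(x, act(y, s)) == act(mul(x, y), s) for x in G for y in G for s in S)

# OP 2: es = s for all s in S.
op2 = all(act(e, s) == s for s in S)

# For each x the map s -> xs is a permutation of S.
bijective = all(sorted(act(x, s) for s in S) == list(S) for x in G)
```

With this encoding OP 1 holds by the very definition of `mul`, which is the content of the remark that OP 1 abbreviates $\pi_{xy} = \pi_x \pi_y$.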

We shall now give examples of operations. 


1. Translations. In Example 7 of §3 we have met translations: for each
$x \in G$ we defined the translation

$$T_x: G \to G \quad\text{by}\quad T_x(y) = xy.$$

Thus we got a homomorphism $x \mapsto T_x$ of $G$ into $\mathrm{Perm}(G)$. Of course $T_x$ is
not a group homomorphism, it is only a permutation of $G$.

Similarly, $G$ operates by translation on the set of subsets, for if $A$ is a
subset of $G$, then $T_x(A) = xA$ is a subset. If $H$ is a subgroup of $G$, then
$T_x(H) = xH$ is a coset of $H$, and $G$ operates by translation on the set of
cosets of $H$. You should previously have worked out several exercises
concerning this operation, although we did not call it by that name.

2. Conjugation. For each $x \in G$ we let $c(x): G \to G$ be the map such that
$c(x)(y) = xyx^{-1}$. Then we have seen in Example 8 of §3 that the map

$$x \mapsto c(x)$$

is a homomorphism of $G$ into $\mathrm{Aut}(G)$, and so this map gives an operation
of $G$ on itself by conjugation. The kernel of this homomorphism is the
set of all $x \in G$ such that $xyx^{-1} = y$ for all $y \in G$, so the kernel is what we
called the center of $G$.

We note that G also operates by conjugation on the set of subgroups 
of G, because the conjugate of a subgroup is a subgroup. We have met 
this operation before, although we did not call it by that name, and you 
should have worked out exercises on this operation in §4. In the case of 
conjugation, we do not use the notation of OP 1 and OP 2, because it 
would lead to confusion if we write xH for conjugation. We reserve xH 
for the translation of $H$ by $x$. For conjugation of $H$ by $x$ we write $c(x)(H)$.

3. Example from linear algebra. Let $\mathbf{R}^n$ be the vector space of column
vectors in $n$-space. Let $G$ be the set of $n \times n$ matrices which are
invertible. Then $G$ is a multiplicative group, and $G$ operates on $\mathbf{R}^n$. For
$A \in G$ and $X \in \mathbf{R}^n$ we have the linear map $L_A: \mathbf{R}^n \to \mathbf{R}^n$ such that

$$L_A(X) = AX.$$

The map $A \mapsto L_A$ is a homomorphism of $G$ into the multiplicative group
of invertible linear maps of $\mathbf{R}^n$ onto itself. Since we write frequently $AX$
instead of $L_A(X)$, the notation

$$(A, X) \mapsto AX$$

is particularly useful in this context, where we see directly that the two
properties OP 1 and OP 2 are satisfied.

For other examples, see the exercises of Chapter VI, §3. 

Suppose we have an operation of $G$ on $S$. Let $s \in S$. We define the
isotropy group of $s \in S$ to be the set of elements $x \in G$ such that $xs = s$.
We denote the isotropy group by $G_s$, and we leave it as an exercise to
verify that indeed, the isotropy group $G_s$ is a subgroup of $G$.

Examples. Let $G$ be a group and $H$ a subgroup. Let $G$ operate on the
set of cosets of $H$ by translation. Then the isotropy group of $H$ is $H$
itself, because if $x \in G$, then $xH = H$ if and only if $x \in H$.

Suppose next that $G$ operates on itself by conjugation. Then the
isotropy group of an element $a \in G$ is called the centralizer of $a$. In this
case, the centralizer consists of all elements $x \in G$ such that $x$ commutes
with $a$, that is

$$xax^{-1} = a \quad\text{or}\quad xa = ax.$$

If we view $G$ as operating by conjugation on the set of subgroups, then
the isotropy group of a subgroup $H$ is what we called the normalizer of
$H$.


Let $G$ operate on a set $S$. We use the notation of OP 1 and OP 2.
Let $s \in S$. The subset of $S$ consisting of all elements $xs$ with $x \in G$ is
called the orbit of $s$ under $G$, and is denoted by $Gs$. Let us denote
this orbit by $O$. Let $t \in O$. Then

$$O = Gs = Gt.$$

This is easily seen, because $t \in O = Gs$ means that there exists $x \in G$ such
that $xs = t$, and then

$$Gt = Gxs = Gs \quad\text{because } Gx = G.$$

An element $t \in O$ is called a representative of the orbit, and we say that $t$
represents the orbit. Note that the notion of orbit is analogous to the
notion of coset, and the notion of representative of an orbit is analogous
to the notion of coset representative. Compare the formalism of orbits
with the formalism of cosets in §4.

Example. Let $G$ operate on itself by conjugation. Then the orbit of an
element $x$ is called a conjugacy class, and consists of all elements

$$yxy^{-1} \quad\text{with } y \in G.$$

In general, let $G$ operate on a set $S$. Let $s \in S$. If $x, y$ are in the same
coset of the subgroup $G_s$ then $xs = ys$. Indeed, we can write $y = xh$ with
$h \in G_s$, so

$$ys = xhs = xs \quad\text{since } hs = s \text{ by assumption.}$$

Therefore we can define a mapping

$$\bar f: G/G_s \to S \quad\text{by}\quad \bar f(xG_s) = xs.$$

We say that $\bar f$ is induced by the map $f: G \to S$ given by $f(x) = xs$.

Proposition 8.1. Let $G$ operate on a set $S$, and let $s \in S$.

(1) The map $x \mapsto xs$ induces a bijection between $G/G_s$ and the orbit $Gs$.

(2) The order of the orbit $Gs$ is equal to the index $(G : G_s)$.

Proof. The image of $\bar f$ is the orbit of $s$ since it consists of all elements
$xs$ of $S$ with $x \in G$. The map $\bar f$ is injective, because if $x, y \in G$ and
$xs = ys$, then $x^{-1}ys = s$ so $x^{-1}y \in G_s$, whence $y \in xG_s$, and $x, y$ lie in the
same coset of $G_s$. Hence $\bar f(xG_s) = \bar f(yG_s)$ implies $xs = ys$, which implies
$xG_s = yG_s$, which concludes the proof.
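
As an illustration of Proposition 8.1, the following sketch computes orbits and isotropy groups by brute force for one operation chosen here as an example: the symmetric group on $\{0, 1, 2, 3\}$ operating on the 2-element subsets. The names `act`, `orbit` and `isotropy` are hypothetical helpers, not notation from the text.

```python
from itertools import combinations, permutations

n = 4
G = list(permutations(range(n)))                       # the symmetric group on {0, 1, 2, 3}
S = [frozenset(c) for c in combinations(range(n), 2)]  # the 2-element subsets

def act(x, s):
    """xs: apply the permutation x to every element of the subset s."""
    return frozenset(x[i] for i in s)

def orbit(s):
    """The orbit Gs."""
    return {act(x, s) for x in G}

def isotropy(s):
    """The isotropy group G_s, as a list of elements of G."""
    return [x for x in G if act(x, s) == s]

# Since G is finite, (G : G_s) = #(G) / #(G_s), so Proposition 8.1 (2) reads
# #(Gs) * #(G_s) = #(G).
for s in S:
    assert len(orbit(s)) * len(isotropy(s)) == len(G)
```

Here the orbit of $\{0, 1\}$ is all of $S$, with 6 elements, and its isotropy group has order 4, in agreement with $24 = 6 \cdot 4$.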

In particular, when G operates by conjugation on the set of subgroups, 
and H is a subgroup, or when G operates on itself by conjugation, we 
obtain from Proposition 8.1 and the definitions: 

Proposition 8.2 

(a) The number of conjugate subgroups to $H$ is equal to the index of the
normalizer of $H$.

(b) Let $x \in G$. The number of elements in the conjugacy class of $x$ is
the index of the centralizer $(G : G_x)$.

The next result gives a very good test for normality. You should have 
done it as an exercise before, but we now do it as an example. 

Proposition 8.3. Let $G$ be a group and $H$ a subgroup of index 2. Then
$H$ is normal.

Proof. Let $S$ be the set of cosets of $H$ and let $G$ operate on $S$ by
translation. For each $x \in G$ let $T_x: S \to S$ be the translation such that
$T_x(aH) = xaH$. Then

$$x \mapsto T_x \quad\text{is a homomorphism of } G \to \mathrm{Perm}(S).$$

Let $K$ be the kernel. If $x \in K$, then in particular, $T_x(H) = H$, so $xH = H$
and $x \in H$. Therefore $K \subset H$. Then $G/K$ is embedded as a subgroup of
$\mathrm{Perm}(S)$, and $\mathrm{Perm}(S)$ has order 2, because $S$ has order 2. Hence
$(G : K) = 1$ or $2$. But

$$(G : K) = (G : H)(H : K),$$

and $(G : H) = 2$. Hence $(H : K) = 1$, whence $H = K$, whence $H$ is normal
because $K$ is normal. This concludes the proof.

Proposition 8.4. Let $G$ operate on a set $S$. Then two orbits of $G$ are
either disjoint or are equal.

Proof. Let $Gs$ and $Gt$ be two orbits with an element in common. This
element can be written

$$xs = yt \quad\text{with some } x, y \in G.$$

Hence

$$Gs = Gxs = Gyt = Gt,$$

so the two orbits are equal, thus concluding the proof.

It follows from Proposition 8.4 that $S$ is the disjoint union of the
distinct orbits, and we can write

$$S = \bigcup_{i \in I} Gs_i \quad\text{(disjoint)}$$

where $I$ is some indexing set, and the $s_i$ represent the distinct orbits.

Suppose that $S$ is a finite set. Let $\#(S)$ be the number of elements of
$S$. We call $\#(S)$ the order of $S$. Then we get a decomposition of the
order of $S$ as a sum of orders of orbits, which we call the orbit
decomposition formula; namely by Proposition 8.1:

$$\#(S) = \sum_{i=1}^{r} (G : G_{s_i}).$$

Example. Let $G$ operate on itself by conjugation. An element $x \in G$ is
in the center of $G$ if and only if the orbit of $x$ is $x$ itself, and thus has
one element. In general, the order of the orbit of $x$ is equal to the index
of the centralizer of $x$. Thus we obtain:

Proposition 8.5. Let $G$ be a finite group. Let $G_x$ be the centralizer of $x$.
Let $Z$ be the center of $G$ and let $y_1, \ldots, y_m$ represent the conjugacy
classes which contain more than one element. Then

$$(G : 1) = (Z : 1) + \sum_{i=1}^{m} (G : G_{y_i})$$

and $(G : G_{y_i}) > 1$ for $i = 1, \ldots, m$.

The formula of Proposition 8.5 is called the class formula or class 
equation. 
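
The class equation can be read off in a small example. The sketch below takes for $G$ the group of symmetries of the square, realized (as an assumption of this illustration) as the permutations of the vertices 0, 1, 2, 3 generated by a rotation `r` and a reflection `s`; the helper names are hypothetical.

```python
def mul(x, y):
    """Product xy of permutations stored as tuples: apply y first, then x."""
    return tuple(x[y[i]] for i in range(len(y)))

def inverse(x):
    """Inverse permutation of x."""
    inv = [0] * len(x)
    for i, xi in enumerate(x):
        inv[xi] = i
    return tuple(inv)

def generate(gens):
    """Set of all products of the generators; for a finite group this is the subgroup generated."""
    e = tuple(range(len(gens[0])))
    group, frontier = {e}, [e]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = mul(g, x)
            if y not in group:
                group.add(y)
                frontier.append(y)
    return group

r = (1, 2, 3, 0)   # rotation of the square: vertex i goes to i + 1 mod 4
s = (0, 3, 2, 1)   # reflection fixing the vertices 0 and 2
G = generate([r, s])

def conj_class(x):
    """The conjugacy class of x: all elements y x y^{-1} with y in G."""
    return frozenset(mul(mul(y, x), inverse(y)) for y in G)

classes = {conj_class(x) for x in G}
center = [x for x in G if len(conj_class(x)) == 1]
big = sorted(len(c) for c in classes if len(c) > 1)

# Class equation: (G : 1) = (Z : 1) + sum of the orders of the classes with more than one element.
assert len(G) == len(center) + sum(big)
```

In this example the equation reads $8 = 2 + (2 + 2 + 2)$: the center has order 2 and there are three conjugacy classes with two elements each.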


II, §8. EXERCISES 

1. Let a group $G$ operate on a set $S$. Let $s, t$ be elements of $S$, and let $x \in G$ be
such that $xs = t$. Show that

$$G_t = xG_sx^{-1}.$$

2. Let $G$ operate on a set $S$, and let $\pi: G \to \mathrm{Perm}(S)$ be the corresponding
homomorphism. Let $K$ be the kernel of $\pi$. Prove that $K$ is the intersection
of all isotropy groups $G_s$ for all $s \in S$. In particular, $K$ is contained in every
isotropy group.

3. (a) Let $G$ be a group of order $p^n$ where $p$ is prime and $n > 0$. Prove that $G$
has a non-trivial center, i.e. the center of $G$ is larger than $\{e\}$.

(b) Prove that $G$ is solvable.

4. Let $G$ be a finite group operating on a finite set $S$.

(a) For each $s \in S$ prove that

$$\sum_{t \in Gs} \frac{1}{\#(Gt)} = 1.$$

(b) $\sum_{s \in S} (1/\#(Gs))$ = number of orbits of $G$ in $S$.

5. Let $G$ be a finite group operating on a finite set $S$. For each $x \in G$ define
$a(x)$ = number of elements $s \in S$ such that $xs = s$. Prove that the number of
orbits of $G$ in $S$ is equal to

$$\frac{1}{\#(G)} \sum_{x \in G} a(x).$$

Hint: Let $T$ be the subset of $G \times S$ consisting of all elements $(x, s)$ such that
$xs = s$. Count the order of $T$ in two ways.
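
Before proving the formula of Exercise 5, one may wish to see it at work. In the sketch below, the cyclic group of order 4 operates by rotation on the colourings of 4 positions with 2 colours; this example, and the names `act`, `orbits` and `fixed`, are chosen here for illustration only. The orbits are counted once directly and once through the set $T$ of the hint.

```python
from itertools import product

n = 4
G = range(n)                           # the element k stands for rotation by k positions
S = list(product((0, 1), repeat=n))    # colourings of n positions with 2 colours

def act(k, s):
    """Rotate the colouring s by k positions; rotations compose by adding k mod n."""
    return s[k:] + s[:k]

# Direct count: each orbit is collected as a set of colourings.
orbits = {frozenset(act(k, s) for k in G) for s in S}

# The numbers a(x) of Exercise 5: for each x, how many s satisfy xs = s.
fixed = [sum(1 for s in S if act(k, s) == s) for k in G]

# Both ways of counting T agree: the sum of the a(x) equals #(G) times the number of orbits.
assert sum(fixed) == len(G) * len(orbits)
```

Here the $a(x)$ are $16, 2, 4, 2$, and $(16 + 2 + 4 + 2)/4 = 6$ is the number of orbits.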

6. Let $S$, $T$ be sets and let $M(S, T)$ denote the set of all mappings of $S$ into $T$.
Let $G$ be a finite group operating on $S$. For each map $f: S \to T$ and $x \in G$
define the map $\pi_x f: S \to T$ by

$$(\pi_x f)(s) = f(x^{-1}s).$$

(a) Prove that $x \mapsto \pi_x$ is an operation of $G$ on $M(S, T)$.

(b) Assume that $S$, $T$ are finite. Let $n(x)$ denote the number of orbits of the
cyclic group $\langle x \rangle$ on $S$. Prove that the number of orbits of $G$ in $M(S, T)$ is
equal to

$$\frac{1}{\#(G)} \sum_{x \in G} (\#T)^{n(x)}.$$


II, §9. SYLOW SUBGROUPS 

Let $p$ be a prime number. By a $p$-group, we mean a finite group whose
order is a power of $p$ (i.e. $p^n$ for some integer $n > 0$). Let $G$ be a finite
group and $H$ a subgroup. We call $H$ a $p$-subgroup of $G$ if $H$ is a $p$-group.

Theorem 9.1. Let $G$ be a $p$-group and not trivial, i.e. $\neq \{e\}$. Then:

(1) $G$ has a non-trivial center.

(2) $G$ is solvable.

Proof. For (1) we use the class equation. Let $Z$ be the center of $G$.
Then

$$(G : 1) = (Z : 1) + \sum (G : G_{x_i}),$$

where the sum is taken over a finite number of elements $x_i$ with
$(G : G_{x_i}) \neq 1$. Since $G$ is a $p$-group, it follows that $p$ divides $(G : 1)$ and also
$(G : G_{x_i})$. Hence $p$ divides $(Z : 1)$, so the center is not trivial.

For (2), $\#(G/Z)$ divides $\#(G)$ so $G/Z$ is a $p$-group, and by (1), we
know that $\#(G/Z) < \#(G)$. By induction $G/Z$ is solvable. By Theorem
4.10 it follows that $G$ is solvable.

Let $G$ be a finite group and let $H$ be a subgroup. Let $p$ be a prime
number. We say that $H$ is a $p$-Sylow subgroup if the order of $H$ is $p^n$ and
if $p^n$ is the highest power of $p$ dividing the order of $G$. We shall prove
below that such subgroups always exist. For this we need a lemma.

Lemma 9.2. Let $G$ be a finite abelian group of order $m$, let $p$ be a prime
number dividing $m$. Then $G$ has a subgroup of order $p$.

Proof. Write $m = p^r s$ where $s$ is prime to $p$. By Theorem 7.1 we can
write $G$ as a direct product

$$G = G(p) \times G'$$

where the order of $G'$ is prime to $p$. Let $a \in G(p)$, $a \neq e$. Let $p^k$ be the
period of $a$. Let

$$b = a^{p^{k-1}}.$$

Then $b \neq e$ but $b^p = e$, and $b$ generates a subgroup of order $p$ as desired.

Theorem 9.3. Let G be a finite group and p a prime number dividing the 
order of G. Then there exists a p-Sylow subgroup of G. 

Proof. By induction on the order of G. If the order of G is prime, our 
assertion is obvious. We now assume given a finite group G, and assume 
the theorem proved for all groups of order smaller than that of G. If 
there exists a proper subgroup H of G whose index is prime to p, then a 
p-Sylow subgroup of H will also be one of G, and our assertion follows 
by induction. We may therefore assume that every proper subgroup has 
an index divisible by p. We now let G act on itself by conjugation. From 
the class formula we obtain 


$$(G : 1) = (Z : 1) + \sum_{i=1}^{m} (G : G_{y_i}).$$

Here, $Z$ is the center of $G$, and the term $(Z : 1)$ is the number of orbits
having one element. The sum on the right is taken over the other orbits,
and each index $(G : G_{y_i})$ is then $> 1$, hence divisible by $p$. Since $p$ divides
the order of $G$, it follows that $p$ divides the order of $Z$, hence in
particular that $G$ has a non-trivial center.

By Lemma 9.2, let $a$ be an element of period $p$ in $Z$, and let $H$ be the
cyclic group generated by $a$. Since $H$ is contained in $Z$, $H$ is normal. Let
$f: G \to G/H$ be the canonical homomorphism. Let $p^n$ be the highest
power of $p$ dividing $(G : 1)$. Then $p^{n-1}$ divides the order of $G/H$. By
induction on the order of the group, there is a $p$-Sylow subgroup $K'$ of
$G/H$. Let $K = f^{-1}(K')$. Then $K \supset H$ and $f$ maps $K$ onto $K'$. Hence we
have an isomorphism $K/H \approx K'$. Hence $K$ has order $p^{n-1}p = p^n$, as
desired.

Theorem 9.4. Let $G$ be a finite group.

(i) If $H$ is a $p$-subgroup of $G$, then $H$ is contained in some $p$-Sylow
subgroup.

(ii) All $p$-Sylow subgroups are conjugate.

(iii) The number of $p$-Sylow subgroups of $G$ is $\equiv 1 \bmod p$.

Proof. All proofs are applications of the technique of the class
formula. We let $S$ be the set of $p$-Sylow subgroups of $G$. Then $G$
operates on $S$ by conjugation.

Proof of (i). Let $P$ be one of the $p$-Sylow subgroups of $G$. Let $S_0$ be
the $G$-orbit of $P$. Let $G_P$ be the normalizer of $P$. Then

$$\#(S_0) = (G : G_P)$$

and $G_P$ contains $P$, so $\#(S_0)$ is not divisible by $p$. Let $H$ be a
$p$-subgroup of $G$ of order $> 1$. Then $H$ operates by conjugation on $S_0$,
and hence $S_0$ itself breaks up into a disjoint union of $H$-orbits. Since
$\#(H)$ is a $p$-power, the index in $H$ of any proper subgroup of $H$ is
divisible by $p$. But we have the orbit decomposition formula

$$\#(S_0) = \sum_{i=1}^{r} (H : H_{P_i}).$$

Since $\#(S_0)$ is prime to $p$, it follows that one of the $H$-orbits in $S_0$ must
consist of only one element, namely a certain Sylow subgroup $P'$. Then
$H$ is contained in the normalizer of $P'$, and hence $HP'$ is a subgroup of
$G$. Furthermore, $P'$ is normal in $HP'$. Since

$$HP'/P' \approx H/(H \cap P'),$$

it follows that the order of $HP'/P'$ is a power of $p$, and therefore so is
the order of $HP'$. Since $P'$ is a maximal $p$-subgroup of $G$, we must have
$HP' = P'$, and hence $H \subset P'$, which proves (i).

Remark. We have also proved:

Let $H$ be a $p$-subgroup of $G$ and suppose that $H$ is contained in the
normalizer of a $p$-Sylow group $P'$. Then $H \subset P'$.

Proof of (ii). Let $H$ be a $p$-Sylow subgroup of $G$. We have shown that
$H$ is contained in some conjugate $P'$ of $P$, and is therefore equal to that
conjugate because the orders of $H$ and $P'$ are equal. This proves (ii).

Proof of (iii). Finally, take $H = P$ itself. Then one orbit of $H$ in $S$ has
exactly one element, namely $P$ itself. Let $S'$ be another orbit of $H$ in $S$.
Then $S'$ cannot have just one element $P'$, otherwise $H$ is contained in the
normalizer of $P'$, and by the remark $H = P'$. Let $s' \in S'$. Then the
isotropy group of $s'$ cannot be all of $H$, and so the index in $H$ of the
isotropy group is $> 1$, and is divisible by $p$ since $H$ is a $p$-group.
Consequently the number of elements in $S'$ is divisible by $p$. Hence we
obtain

$$\#(S) = 1 + \text{indices divisible by } p \equiv 1 \bmod p.$$

This proves (iii).
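
Part (iii) can be observed in small symmetric groups. The sketch below makes a simplifying assumption which is not part of the theorem: it treats only primes $p$ such that $p$ divides the order of the group but $p^2$ does not, so that the $p$-Sylow subgroups are exactly the subgroups of order $p$ and can be counted through the elements of period $p$. The function names are hypothetical.

```python
from itertools import permutations
from math import factorial

def mul(x, y):
    """Product xy of permutations stored as tuples: apply y first, then x."""
    return tuple(x[y[i]] for i in range(len(y)))

def period(x):
    """Smallest k >= 1 such that x^k is the identity."""
    e = tuple(range(len(x)))
    k, y = 1, x
    while y != e:
        y = mul(x, y)
        k += 1
    return k

def count_sylow_of_prime_order(n, p):
    """Number of p-Sylow subgroups of the symmetric group on n elements,
    under the assumption that p divides n! but p^2 does not."""
    assert factorial(n) % p == 0 and factorial(n) % (p * p) != 0
    elements_of_period_p = sum(1 for x in permutations(range(n)) if period(x) == p)
    # A subgroup of order p contains exactly p - 1 elements of period p,
    # and two distinct subgroups of order p have only e in common.
    return elements_of_period_p // (p - 1)

# In each case the count is congruent to 1 mod p.
for n, p in [(3, 2), (3, 3), (4, 3), (5, 3), (5, 5)]:
    assert count_sylow_of_prime_order(n, p) % p == 1
```

For instance the symmetric group on 4 elements has four 3-Sylow subgroups, and the symmetric group on 5 elements has six 5-Sylow subgroups.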

II, §9. EXERCISES 

1. Let p be a prime number. What is the order of the p-Sylow subgroup of the 
symmetric group on p elements? 

2. Prove that a group of order 15 is abelian, and in fact cyclic. 

3. Let $S_3$ be the symmetric group on 3 elements.

(a) Determine the 2-Sylow subgroups of $S_3$.

(b) Determine the 3-Sylow subgroups of $S_3$.

4. (a) Prove that a group of order 12 is solvable.

(b) Prove that $S_4$ is solvable.

5. Let $G$ be a group of order $p^3$ which is not abelian.

(a) Show that the center has order $p$.

(b) Let $Z$ be the center. Show that $G/Z \approx C \times C$, where $C$ is a cyclic group
of order $p$.

6. (a) Let $H$ be a subgroup of a finite group $G$. Let $P$ be a $p$-Sylow subgroup of
$G$. Prove that there exists a conjugate $P'$ of $P$ in $G$ such that $P' \cap H$ is a
$p$-Sylow subgroup of $H$.

(b) Assume $H$ normal and $(G : H)$ prime to $p$. Prove that $H$ contains every
$p$-Sylow subgroup of $G$.

7. Let $G$ be a group of order 6, and assume that $G$ is not commutative. Show
that $G$ is isomorphic to $S_3$. [Hint: Show that $G$ contains elements $a$, $b$ such
that $a^2 = e$, $b^3 = e$, and $aba = b^2 = b^{-1}$.]

8. Let $G$ be a non-commutative group of order 8. Show that $G$ is isomorphic to
the group of symmetries of the square, or to the quaternion group, in other
words $G$ is isomorphic to one of the groups of Exercises 8 and 9 of §1.

9. Let $G$ be a finite group and assume that all the Sylow subgroups are normal.
Let $P_1, \ldots, P_r$ be the Sylow subgroups. Prove that the map

$$P_1 \times \cdots \times P_r \to G \quad\text{given by}\quad (x_1, \ldots, x_r) \mapsto x_1 \cdots x_r$$

is an isomorphism. So $G$ is isomorphic to the direct product of its Sylow
subgroups. [Hint: Recall Exercise 12 of §5.]

10. Let $G$ be a finite group of order $pq$, where $p$, $q$ are prime and $p < q$. Suppose
that $q \not\equiv 1 \bmod p$. Prove that $G$ is abelian, and in fact cyclic.

11. Let $G$ be a finite group. Let $N(H)$ denote the normalizer of a subgroup $H$.
Let $P$ be a $p$-Sylow subgroup of $G$. Prove that $N(N(P)) = N(P)$.


CHAPTER III 


Rings 


In this chapter, we axiomatize the notions of addition and multiplication.


III, §1. RINGS

A ring $R$ is a set, whose objects can be added and multiplied (i.e. we are
given associations $(x, y) \mapsto x + y$ and $(x, y) \mapsto xy$ from pairs of elements
of $R$, into $R$), satisfying the following conditions:

RI 1. Under addition, $R$ is an additive (abelian) group.

RI 2. For all $x, y, z \in R$ we have

$$x(y + z) = xy + xz \quad\text{and}\quad (y + z)x = yx + zx.$$

RI 3. For all $x, y, z \in R$, we have associativity $(xy)z = x(yz)$.

RI 4. There exists an element $e \in R$ such that $ex = xe = x$ for all $x \in R$.

Example 1. Let $R$ be the integers $\mathbf{Z}$. Then $R$ is a ring.

Example 2. The rational numbers, the real numbers, and the complex
numbers all are rings.

Example 3. Let $R$ be the set of continuous real-valued functions on
the interval $[0, 1]$. The sum and product of two functions $f$, $g$ are defined
as usual, namely $(f + g)(t) = f(t) + g(t)$, and $(fg)(t) = f(t)g(t)$.
Then $R$ is a ring.
More generally, let $S$ be a non-empty set, and let $R$ be a ring. Let
$M(S, R)$ be the set of mappings of $S$ into $R$. Then $M(S, R)$ is a ring, if
we define the addition and product of mappings $f$, $g$ by the rules

$$(f + g)(x) = f(x) + g(x) \quad\text{and}\quad (fg)(x) = f(x)g(x).$$

We leave the verification as a simple exercise for the reader.

Example 4 (The ring of endomorphisms). Let $A$ be an abelian group.
Let $\mathrm{End}(A)$ denote the set of homomorphisms of $A$ into itself. We call
$\mathrm{End}(A)$ the set of endomorphisms of $A$. Thus $\mathrm{End}(A) = \mathrm{Hom}(A, A)$ in the
notation of Chapter II, §3. We know that $\mathrm{End}(A)$ is an additive group.

If we let the multiplicative law of composition on $\mathrm{End}(A)$ be ordinary
composition of mappings, then $\mathrm{End}(A)$ is a ring.

We prove this in detail. We already know RI 1. As for RI 2, let $f$, $g$,
$h \in \mathrm{End}(A)$. Then for all $x \in A$,

$$(f \circ (g + h))(x) = f((g + h)(x)) = f(g(x) + h(x)) = f(g(x)) + f(h(x)) = (f \circ g)(x) + (f \circ h)(x).$$

Hence $f \circ (g + h) = f \circ g + f \circ h$. Similarly on the other side. We observe
that RI 3 is nothing but the associativity for composition of mappings in
this case, and we already know it. The unit element of RI 4 is the identity
mapping $I$. Thus we have seen that $\mathrm{End}(A)$ is a ring.

A ring $R$ is said to be commutative if $xy = yx$ for all $x, y \in R$. The
rings of Examples 1, 2, 3 are commutative. In general, the ring of Example 4
is not commutative.

As with groups, the element $e$ of a ring $R$ satisfying RI 4 is unique,
and is called the unit element of the ring. It is often denoted by 1. Note
that if $1 = 0$ in the ring $R$, then $R$ consists of 0 alone, in which case it is
called the zero ring.

In a ring R, a number of ordinary rules of arithmetic can be deduced 
from the axioms. We shall list these. 

We have $0x = 0$ for all $x \in R$.

Proof. We have

$$0x + x = 0x + ex = (0 + e)x = ex = x.$$

Hence $0x = 0$.

We have $(-e)x = -x$ for all $x \in R$.

Proof.

$$(-e)x + x = (-e)x + ex = (-e + e)x = 0x = 0.$$

We have $(-e)(-e) = e$.

Proof. We multiply the equation

$$e + (-e) = 0$$

by $-e$, and find

$$-e + (-e)(-e) = 0.$$

Adding $e$ to both sides yields $(-e)(-e) = e$, as desired.

We leave it as an exercise to prove that

$$(-x)y = -xy \quad\text{and}\quad (-x)(-y) = xy$$

for all $x, y \in R$.

From condition RI 2, which is called the distributive law, we can deduce
the analogous rule with several elements, namely if $x, y_1, \ldots, y_n$ are
elements of the ring $R$, then

$$x(y_1 + \cdots + y_n) = xy_1 + \cdots + xy_n.$$

Similarly, if $x_1, \ldots, x_m$ are elements of $R$, then

$$(x_1 + \cdots + x_m)(y_1 + \cdots + y_n) = x_1y_1 + \cdots + x_my_n = \sum_{i=1}^{m} \sum_{j=1}^{n} x_iy_j.$$

The sum on the right hand side is to be taken over all indices i and j as 
indicated. These more general rules can be proved by induction, and we 
shall omit the proofs, which are tedious. 

Let $R$ be a ring. By a subring $R'$ of $R$ one means a subset of $R$ such
that the unit element of $R$ is in $R'$, and if $x, y \in R'$, then $-x$, $x + y$, and
$xy$ are also in $R'$. It then follows obviously that $R'$ is a ring, the operations
of addition and multiplication in $R'$ being the same as those in $R$.

Example 5. The integers form a subring of the rational numbers,
which form a subring of the real numbers.

Example 6. The real-valued differentiable functions on $\mathbf{R}$ form a subring
of the ring of continuous functions.

Let $R$ be a ring. It may happen that there exist elements $x, y \in R$ such
that $x \neq 0$ and $y \neq 0$, but $xy = 0$. Such elements are called divisors of
zero. A commutative ring without divisors of zero, and such that $1 \neq 0$
is called an integral ring. A commutative ring such that the subset of
nonzero elements form a group under multiplication is called a field.
Observe that in a field, we have necessarily $1 \neq 0$, and that a field has no
divisors of zero (proof?).

Example 7. The integers $\mathbf{Z}$ form an integral ring. Every field is an
integral ring. We shall see later that the polynomials over a field form an
integral ring.

Let $R$ be a ring. We denote by $R^*$ the set of elements $x \in R$ such that
there exists $y \in R$ such that $xy = yx = e$. In other words, $R^*$ is the set of
elements $x \in R$ which have a multiplicative inverse. The elements of $R^*$
are called the units of $R$. We leave it as an exercise to show that the
units form a multiplicative group. As an example, the units of a field
form the group of non-zero elements of the field.
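
Units and divisors of zero can be listed explicitly in a small commutative ring. The sketch below assumes the familiar arithmetic of residue classes modulo 12, that is, the ring $\mathbf{Z}/12\mathbf{Z}$ with classes represented by $0, \ldots, 11$; this choice of ring and the variable names are made here for illustration only.

```python
n = 12
R = range(n)   # representatives 0, ..., n-1 of the classes mod n

def mul(x, y):
    """Product of two residue classes mod n."""
    return (x * y) % n

# Units: elements x with some y such that xy = 1 (the ring is commutative, so yx = 1 too).
units = [x for x in R if any(mul(x, y) == 1 for y in R)]

# Divisors of zero: nonzero x with some nonzero y such that xy = 0.
zero_divisors = [x for x in R if x != 0 and any(y != 0 and mul(x, y) == 0 for y in R)]

# The units are closed under multiplication, as they must be to form a group.
closed = all(mul(x, y) in units for x in units for y in units)
```

In this ring the units are the classes of 1, 5, 7, 11, and every other nonzero class is a divisor of zero, so the ring is not integral.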

Let $R$ be a ring, and $x \in R$. If $n$ is a positive integer, we define

$$x^n = x \cdots x,$$

the product being taken $n$ times. Then for positive integers $m$, $n$ we have
$x^{n+m} = x^n x^m$ and $(x^m)^n = x^{mn}$.


III, §1. EXERCISES

1. Let $p$ be a prime number. Let $R$ be the subset of all rational numbers $m/n$
such that $n \neq 0$ and $n$ is not divisible by $p$. Show that $R$ is a ring.

2. How would you describe the units in Exercise 1? Prove whatever assertion
you make.

3. Let $R$ be an integral ring. If $a, b, c \in R$, $a \neq 0$, and $ab = ac$, then prove that
$b = c$.

4. Let $R$ be an integral ring, and $a \in R$, $a \neq 0$. Show that the map $x \mapsto ax$ is an
injective mapping of $R$ into itself.

5. Let $R$ be a finite integral ring. Show that $R$ is a field. [Hint: Use the
preceding exercise.]

6. Let $R$ be a ring such that $x^2 = x$ for all $x \in R$. Show that $R$ is commutative.

7. Let $R$ be a ring and $x \in R$. We define $x$ to be nilpotent if there exists a
positive integer $n$ such that $x^n = 0$. If $x$ is nilpotent, prove that $1 + x$ is a
unit, and so is $1 - x$.

8. Prove in detail that the units of a ring form a multiplicative group.

9. Let $R$ be a ring, and $Z$ the set of all elements $a \in R$ such that $ax = xa$ for all
$x \in R$. Show that $Z$ is a subring of $R$, called the center of $R$.

10. Let $R$ be the set of numbers of type $a + b\sqrt{2}$ where $a$, $b$ are rational
numbers. Show that $R$ is a ring, and in fact that $R$ is a field.

11. Let $R$ be the set of numbers of type $a + b\sqrt{2}$ where $a$, $b$ are integers. Show
that $R$ is a ring, but not a field.

12. Let $R$ be the set of numbers of type $a + bi$ where $a$, $b$ are integers and
$i = \sqrt{-1}$. Show that $R$ is a ring. List all its units.

13. Let $R$ be the set of numbers of type $a + bi$ where $a$, $b$ are rational numbers.
Show that $R$ is a field.

14. Let $S$ be a set, $R$ a ring, and $f: S \to R$ a bijective mapping. For each $x, y \in S$
define

$$x + y = f^{-1}(f(x) + f(y)) \quad\text{and}\quad xy = f^{-1}(f(x)f(y)).$$

Show that these sum and product define a ring structure on $S$.

15. In a ring $R$ it may happen that a product $xy$ is equal to 0 but $x \neq 0$ and
$y \neq 0$. Give an example of this in the ring of $n \times n$ matrices over a field $K$.
Also give an example in the ring of continuous functions on the interval
$[0, 1]$. [In this exercise we assume that you know matrices and continuous
functions. For matrices, see Chapter V, §3.]


III, §2. IDEALS

Let $R$ be a ring. A left ideal of $R$ is a subset $J$ of $R$ having the following
properties: If $x, y \in J$, then $x + y \in J$ also, the zero element is in $J$,
and if $x \in J$ and $a \in R$, then $ax \in J$.

Using the negative $-e$, we see that if $J$ is a left ideal, and $x \in J$, then
$-x \in J$ also, because $-x = (-e)x$. Thus the elements of a left ideal form
an additive subgroup of $R$ and we may as well say that a left ideal is an
additive subgroup $J$ of $R$ such that, if $x \in J$ and $a \in R$ then $ax \in J$.

We note that $R$ is a left ideal, called the unit ideal, and so is the subset
of $R$ consisting of 0 alone. We have $J = R$ if and only if $1 \in J$.

Similarly, we can define a right ideal and a two-sided ideal. Thus a
two-sided ideal $J$ is by definition an additive subgroup of $R$ such that, if
$x \in J$ and $a \in R$, then $ax$ and $xa \in J$.
Example 1. Let R be the ring of continuous real-valued functions on
the interval [0, 1]. Let J be the subset of functions / such that = 0. 
Then J is an ideal (two-sided, since R is commutative). 

Example 2. Let $R$ be the ring of integers $\mathbf{Z}$. Then the even integers,
i.e. the integers of type $2n$ with $n \in \mathbf{Z}$, form an ideal. Do the odd integers
form an ideal?

Example 3. Let $R$ be a ring, and $a$ an element of $R$. The set of elements
$xa$, with $x \in R$, is a left ideal, called the principal left ideal generated
by $a$. (Verify in detail that it is a left ideal.) We denote it by $(a)$.
More generally, let $a_1, \ldots, a_n$ be elements of $R$. The set of all elements

$$x_1a_1 + \cdots + x_na_n$$

with $x_i \in R$, is a left ideal, denoted by $(a_1, \ldots, a_n)$. We call $a_1, \ldots, a_n$ generators
for this ideal.

We shall give a complete proof for this to show how easy it is, and
leave the proof of further statements in the next examples as exercises.
If $x_1, \ldots, x_n, y_1, \ldots, y_n \in R$ then

\[ (x_1 a_1 + \cdots + x_n a_n) + (y_1 a_1 + \cdots + y_n a_n)
   = x_1 a_1 + y_1 a_1 + \cdots + x_n a_n + y_n a_n
   = (x_1 + y_1) a_1 + \cdots + (x_n + y_n) a_n. \]

If $z \in R$, then

\[ z(x_1 a_1 + \cdots + x_n a_n) = z x_1 a_1 + \cdots + z x_n a_n. \]

Finally,

\[ 0 = 0 a_1 + \cdots + 0 a_n. \]

This proves that the set of all elements $x_1 a_1 + \cdots + x_n a_n$ with $x_i \in R$ is a
left ideal.
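
The three steps of this proof can be checked by brute force in a small case. The following Python sketch is only an illustration under stated assumptions: it takes the finite ring Z/nZ of integers modulo n as its model, and the names `generated_ideal` and `is_ideal` are hypothetical helpers, not notation from the text.

```python
from itertools import product

def generated_ideal(n, gens):
    """All sums x_1*a_1 + ... + x_k*a_k in Z/nZ, with every x_i ranging over Z/nZ."""
    return {
        sum(x * a for x, a in zip(xs, gens)) % n
        for xs in product(range(n), repeat=len(gens))
    }

def is_ideal(n, J):
    """Check the defining properties of an ideal J of Z/nZ by exhaustion."""
    has_zero = 0 in J
    closed_under_sum = all((x + y) % n in J for x in J for y in J)
    closed_under_ring = all((z * x) % n in J for z in range(n) for x in J)
    return has_zero and closed_under_sum and closed_under_ring

# The ideal (8, 6) of Z/12Z, built exactly as the set of all x_1*8 + x_2*6.
J = generated_ideal(12, [8, 6])
print(sorted(J), is_ideal(12, J))
```

Such a check proves nothing in general, of course; it only confirms the argument above in one instance.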


Example 4. Let R be a ring. Let L, M be left ideals. We denote by
LM the set of all elements $x_1 y_1 + \cdots + x_n y_n$ with $x_i \in L$ and $y_i \in M$. It is
an easy exercise for the reader to verify that LM is also a left ideal.
Verify also that if L, M, N are left ideals, then (LM)N = L(MN).

Example 5. Let L, M be left ideals. We define L + M to be the
subset consisting of all elements $x + y$ with $x \in L$ and $y \in M$. Then L + M is
a left ideal. Besides verifying this in detail, also show that if L, M, N are
left ideals, then

\[ L(M + N) = LM + LN. \]

Also formulate and prove the analogues of Examples 4 and 5 for right
and two-sided ideals.
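
To experiment with sums and products of ideals before proving anything, one can compute them by exhaustion. The sketch below assumes the commutative ring Z/60Z as a test case; the function names are illustrative only.

```python
def principal(n, a):
    """The principal ideal (a) of Z/nZ."""
    return frozenset((x * a) % n for x in range(n))

def ideal_sum(n, L, M):
    """L + M: all x + y with x in L and y in M."""
    return frozenset((x + y) % n for x in L for y in M)

def ideal_product(n, L, M):
    """LM: all finite sums of products x*y with x in L and y in M."""
    products = {(x * y) % n for x in L for y in M}
    sums = {0}
    while True:
        # Add one more product to every sum found so far, until nothing new appears.
        bigger = sums | {(s + p) % n for s in sums for p in products}
        if bigger == sums:
            return frozenset(sums)
        sums = bigger

n = 60
L, M, N = principal(n, 2), principal(n, 3), principal(n, 5)
lhs = ideal_product(n, L, ideal_sum(n, M, N))
rhs = ideal_sum(n, ideal_product(n, L, M), ideal_product(n, L, N))
print(lhs == rhs)
```

Here both sides come out to the ideal (2), which is consistent with the distributive law of Example 5 but does not replace its proof.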

Example 6. Let L be a left ideal, and denote by LR the set of
elements $x_1 y_1 + \cdots + x_n y_n$ with $x_i \in L$ and $y_i \in R$. Then LR is a two-sided
ideal. The proof is again left as an exercise.

Example 7. In Theorem 3.1 of Chapter I we proved that every ideal 
of Z is principal. 

III, §2. EXERCISES

1. Show that a field has no ideal other than the zero and unit ideal. 

2. Let R be a commutative ring. If M is an ideal, abbreviate MM by $M^2$. Let
$M_1$, $M_2$ be two ideals such that $M_1 + M_2 = R$. Show that $M_1^2 + M_2^2 = R$.

3. Let R be the ring of Exercise 1 in the preceding section. Show that the subset 
of elements m/n in R such that m is divisible by p is an ideal. 

4. Let R be a ring and $J_1$, $J_2$ left ideals. Show that $J_1 \cap J_2$ is a left ideal, and
similarly for right and two-sided ideals.

5. Let R be a ring and $a \in R$. Let J be the set of all $x \in R$ such that $xa = 0$.
Show that J is a left ideal.

6. Let R be a ring and L a left ideal. Let M be the set of all $x \in R$ such that
$xL = 0$ (i.e. $xy = 0$ for all $y \in L$). Show that M is a two-sided ideal.

7. Let R be a commutative ring. Let L, M be ideals. 

(a) Show that $LM \subset L \cap M$.

(b) Give an example when $LM \neq L \cap M$.

Note that as a result of (a), if J is an ideal of R, then we get a sequence of
ideals contained in each other by taking powers of J, namely

\[ J \supset J^2 \supset J^3 \supset \cdots \supset J^n \supset \cdots \]

8. The following example will be of interest in calculus. Let R be the ring of
infinitely differentiable functions defined, say, on the open interval $-1 < t < 1$.
Let $J_n$ be the set of functions $f \in R$ such that $D^k f(0) = 0$ for all integers $k$ with
$0 \leq k \leq n$. Here D denotes the derivative, so $J_n$ is the set of functions all of
whose derivatives up to order $n$ vanish at 0. Show that $J_n$ is an ideal in R.

9. Let R be the ring of real-valued functions on the interval [0, 1]. Let S be a
subset of this interval. Show that the set of all functions $f \in R$ such that
$f(x) = 0$ for all $x \in S$ is an ideal of R.

Note: If you know about matrices and linear maps then you should immediately
do the exercises of Chapter V, §3, and you should look at those of
Chapter V, §4. Of course, if necessary, do them after you have read the required
material. But they give examples of rings and ideals.

III, §3. HOMOMORPHISMS

Let R, R' be rings. By a ring-homomorphism $f\colon R \to R'$, we shall mean a
mapping having the following properties: For all $x, y \in R$,

\[ f(x + y) = f(x) + f(y), \qquad f(xy) = f(x)f(y), \qquad f(e) = e' \]

(if $e$, $e'$ are the unit elements of R and R' respectively).

By the kernel of a ring-homomorphism $f\colon R \to R'$, we shall mean its
kernel viewed as a homomorphism of additive groups, i.e. it is the set of
all elements $x \in R$ such that $f(x) = 0$. Exercise: Prove that the kernel is
a two-sided ideal of R.

Example 1. Let R be the ring of complex-valued functions on the
interval [0, 1]. The map which to each function $f \in R$ associates its value
$f(1/2)$ is a ring-homomorphism of R into C.

Example 2. Let R be the ring of real-valued functions on the interval
[0, 1]. Let R' be the ring of real-valued functions on the interval [0, 1/2].
Each function $f \in R$ can be viewed as a function on [0, 1/2], and when we
so view $f$, we call it the restriction of $f$ to [0, 1/2]. More generally, let S
be a set, and S' a subset. Let R be the ring of real-valued functions on
S. For each $f \in R$, we denote by $f|S'$ the function on S' whose value at
an element $x \in S'$ is $f(x)$. Then $f|S'$ is called the restriction of $f$ to S'.
Let R' be the ring of real-valued functions on S'. Then the map

\[ f \mapsto f|S' \]

is a ring-homomorphism of R into R'.

Since the kernel of a ring-homomorphism is defined only in terms of 
the additive groups involved, we know that a ring-homomorphism whose 
kernel is trivial is injective. 

Let $f\colon R \to R'$ be a ring-homomorphism. If there exists a ring-homomorphism
$g\colon R' \to R$ such that $g \circ f$ and $f \circ g$ are the respective identity
mappings, then we say that $f$ is a ring-isomorphism. A ring-isomorphism
of a ring with itself is called an automorphism. As with groups, we have
the following properties.

If $f\colon R \to R'$ is a ring-homomorphism which is a bijection, then $f$ is a
ring-isomorphism.

Furthermore, if $f\colon R \to R'$ and $g\colon R' \to R''$ are ring-homomorphisms, then
the composite $g \circ f\colon R \to R''$ is also a ring-homomorphism.

We leave both proofs to the reader. 

Remark. We have so far met with group homomorphisms and ring
homomorphisms, and we have defined the notion of isomorphism in a
similar way in each one of these categories of objects. Our definitions
have followed a completely general pattern, which can be applied to
other objects and categories (for instance modules, which we shall meet
later). In general, not to prejudice what kind of objects one deals with,
one may use the word morphism instead of homomorphism. Then an
isomorphism (in whatever category) is a morphism $f$ for which there exists
a morphism $g$ satisfying

\[ f \circ g = \mathrm{id} \quad \text{and} \quad g \circ f = \mathrm{id}. \]

In other words, it is a morphism which has an inverse. 

The symbol id denotes the identity. For this general definition to 
make sense, very few properties of morphisms need be satisfied: only 
associativity, and the existence of an identity for each object. We don’t
want to go fully into this abstraction now, but watch for it as you
encounter the pattern later. An automorphism is then defined as an
isomorphism of an object with itself. It then follows completely generally 
directly from the definition that the automorphisms of an object form a 
group. One of the basic topics of study of mathematics is the structure of 
the groups of automorphisms of various objects. For instance, in Chapter 
II, as an exercise you determined the group of automorphisms of a cyclic 
group. In Galois theory, you will determine the automorphisms of 
certain fields. 

We shall now define a notion similar to that of factor group, but ap- 
plied to rings. 

Let R be a ring and M a two-sided ideal. If $x, y \in R$, define $x$ congruent
to $y$ mod M to mean $x - y \in M$. We write this relation in the form

\[ x \equiv y \pmod{M}. \]

It is then very simple to prove the following statements.

(a) We have $x \equiv x \pmod{M}$.

(b) If $x \equiv y$ and $y \equiv z \pmod{M}$, then $x \equiv z \pmod{M}$.

(c) If $x \equiv y$ then $y \equiv x \pmod{M}$.

(d) If $x \equiv y \pmod{M}$, and $z \in R$, then $xz \equiv yz \pmod{M}$, and also
$zx \equiv zy \pmod{M}$.

(e) If $x \equiv y$ and $x' \equiv y' \pmod{M}$, then $xx' \equiv yy' \pmod{M}$.
Furthermore, $x + x' \equiv y + y' \pmod{M}$.

The proofs of the preceding assertions are all trivial. As an example,
we shall give the proof of the first part of (e). The hypothesis means
that we can write

\[ x = y + z \quad \text{and} \quad x' = y' + z' \]

with $z, z' \in M$. Then

\[ xx' = (y + z)(y' + z') = yy' + zy' + yz' + zz'. \]

Since M is a two-sided ideal, each one of $zy'$, $yz'$, $zz'$ lies in M, and
consequently their sum lies in M. Hence $xx' \equiv yy' \pmod{M}$, as was to be
shown.

Remark. The present notion of congruence generalizes the notion
defined for the integers in Chapter I. Indeed, if R = Z, then the
congruence

\[ x \equiv y \pmod{n} \]

in Chapter I, meaning that $x - y$ is divisible by $n$, is equivalent to the
property that $x - y$ lies in the ideal generated by $n$.

If $x \in R$, we let $\bar{x}$ be the set of all elements of R which are congruent
to $x \pmod{M}$. Recalling the definition of a factor group, we see that $\bar{x}$ is
none other than the additive coset $x + M$ of $x$, relative to M. Any element
of that coset (also called congruence class of $x$ mod M) is called a
representative of the coset.

We let $\bar{R}$ be the set of all congruence classes of R mod M. In other
words, we let $\bar{R} = R/M$ be the additive factor group of R modulo M.
Then we already know that $\bar{R}$ is an additive group. We shall now define
a multiplication which will make $\bar{R}$ into a ring.

If $\bar{x}$ and $\bar{y}$ are additive cosets of M, we define their product to be the
coset of $xy$, i.e. to be $\overline{xy}$. Using condition (e) above, we see that this
coset is independent of the selected representatives $x$ in $\bar{x}$ and $y$ in $\bar{y}$.
Thus our multiplication is well defined by the rule

\[ (x + M)(y + M) = (xy + M). \]

It is now a simple matter to verify that the axioms of a ring are
satisfied. RI 1 is already known since R/M is taken as the factor group.
For RI 2, let $\bar{x}$, $\bar{y}$, $\bar{z}$ be congruence classes, with representatives $x$, $y$, $z$
respectively in R. Then $y + z$ is a representative of $\bar{y} + \bar{z}$ by definition,
and $x(y + z)$ is a representative of $\bar{x}(\bar{y} + \bar{z})$. But $x(y + z) = xy + xz$.
Furthermore, $xy$ is a representative of $\bar{x}\bar{y}$ and $xz$ is a representative of
$\bar{x}\bar{z}$. Hence by definition,

\[ \bar{x}(\bar{y} + \bar{z}) = \bar{x}\bar{y} + \bar{x}\bar{z}. \]

Similarly, one proves RI 3. As for RI 4, if $e$ denotes the unit element of
R, then $\bar{e}$ is a unit element in $\bar{R}$, because $ex = x$ is a representative of $\bar{e}\bar{x}$.
This proves all the axioms.
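
The key point of the construction, that the product of two cosets does not depend on the representatives chosen, can be observed numerically. The following Python sketch assumes R = Z and M = (n), and it samples only finitely many representatives of each coset, so it illustrates condition (e) rather than proving it. The helper names are hypothetical.

```python
def representatives(x, n, spread=3):
    """A few representatives x + k*n of the coset of x modulo the ideal (n) of Z."""
    return [x + k * n for k in range(-spread, spread + 1)]

def coset_product_is_well_defined(n):
    """True if the coset of x*y never depends on the sampled representatives."""
    for x in range(n):
        for y in range(n):
            classes = {
                (xr * yr) % n
                for xr in representatives(x, n)
                for yr in representatives(y, n)
            }
            # Every choice of representatives must land in the coset of x*y.
            if classes != {(x * y) % n}:
                return False
    return True

print(coset_product_is_well_defined(6))
```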

We call $\bar{R} = R/M$ the factor ring of R modulo M.

We observe that the map $f\colon R \to R/M$ such that $f(x) = \bar{x}$ is a
ring-homomorphism of R onto R/M, whose kernel is M. The verification is
immediate, and essentially amounts to the definition of the addition
and multiplication of cosets of M.

Theorem 3.1. Let $f\colon R \to S$ be a ring-homomorphism and let M be its
kernel. For each coset C of M the image $f(C)$ is an element of S, and
the association

\[ \bar{f}\colon C \mapsto f(C) \]

is a ring-isomorphism of R/M onto the image of $f$.

Proof. The fact that the image of $f$ is a subring of S is left as an
exercise (Exercise 1). Each coset C consists of all elements $x + z$ with some
$x$ and all $z \in M$. Thus

\[ f(x + z) = f(x) + f(z) = f(x) \]

implies that $f(C)$ consists of one element. Thus we get a map

\[ \bar{f}\colon C \mapsto f(C) \]

as asserted. If $x$, $y$ represent cosets of M, then the relations

\[ f(xy) = f(x)f(y), \]
\[ f(x + y) = f(x) + f(y), \]
\[ f(e_R) = e_S \]

show that $\bar{f}$ is a homomorphism of R/M into S. If $\bar{x} \in R/M$ is such that
$\bar{f}(\bar{x}) = 0$, this means that for any representative $x$ of $\bar{x}$ we have $f(x) = 0$,
whence $x \in M$ and $\bar{x} = 0$ (in R/M). Thus $\bar{f}$ is injective. This proves what
we wanted.
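
A small finite case makes the statement of Theorem 3.1 tangible. The sketch below assumes the homomorphism of Z/12Z onto Z/4Z given by reduction modulo 4 (an example chosen here for illustration, not taken from the proof), lists the cosets of its kernel, and checks that each coset has a single image and that distinct cosets have distinct images.

```python
def f(x):
    """Reduction modulo 4, viewed as a ring-homomorphism Z/12Z -> Z/4Z."""
    return x % 4

R = range(12)
kernel = frozenset(x for x in R if f(x) == 0)
cosets = {frozenset((x + z) % 12 for z in kernel) for x in R}

# f is constant on each coset, so C -> f(C) is a well-defined map.
induced = {C: {f(x) for x in C} for C in cosets}
single_valued = all(len(values) == 1 for values in induced.values())

# Distinct cosets have distinct images, so the induced map is injective.
images = [next(iter(values)) for values in induced.values()]
injective = len(set(images)) == len(images)

print(sorted(kernel), len(cosets), single_valued, injective)
```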

Example 3. If R = Z, and n is a non-zero integer, then R/(n) = Z/(n) 
is called the ring of integers modulo n. We note that this is a finite ring, 
having exactly $n$ elements. (Proof?) We also write Z/nZ instead of
Z/(n).

Example 4. Let R be any ring, with unit element $e$. Let $a \in R$. Since
R is also an additive abelian group, we know how to define $na$ for
every integer $n$. If $n$ is positive, then

\[ na = a + a + \cdots + a, \]

the sum being taken $n$ times. If $n = -k$ where $k$ is positive, so $n$ is
negative, then

\[ na = -(ka). \]

In particular, we can take $a = e$, so we get a map

\[ f\colon \mathbf{Z} \to R \quad \text{such that} \quad n \mapsto ne. \]

As in Example 4 of Chapter II, §3 we know that this map $f$ is a
homomorphism of additive abelian groups. But $f$ is also a ring
homomorphism. Indeed, first note the property that for every positive
integer $n$,

\[ (ne)a = \underbrace{(e + \cdots + e)}_{n \text{ times}}\, a = ea + \cdots + ea = n(ea) = a + \cdots + a = na. \]

If $m$, $n$ are positive integers, then

\[ m(na) = \underbrace{na + \cdots + na}_{m \text{ times}}
   = \underbrace{\underbrace{a + \cdots + a}_{n \text{ times}} + \cdots + \underbrace{a + \cdots + a}_{n \text{ times}}}_{m \text{ times}}
   = \underbrace{a + \cdots + a}_{mn \text{ times}}
   = (mn)a. \]

Hence putting $a = e$, we get

\[ f(mn) = (mn)e = m(ne) = (me)(ne) = f(m)f(n). \]

We leave it to the reader to verify the similar property when $m$
or $n$ is negative. The proof uses the case when $m$, $n$ are positive, together
with the property of homomorphisms that $f(-n) = -f(n)$.

Let $f\colon \mathbf{Z} \to R$ be a ring homomorphism. By definition we must have
$f(1) = e$. Hence necessarily for every positive integer $n$ we must have

\[ f(n) = f(1 + \cdots + 1) = f(1) + \cdots + f(1) = ne, \]

and for a negative integer $m = -k$,

\[ f(-k) = -f(k) = -(ke). \]

Thus there is one and only one ring homomorphism of Z into a ring
R, which is the one we defined above.

Assume $R \neq \{0\}$. Let $f\colon \mathbf{Z} \to R$ be the ring homomorphism. Then
the kernel of $f$ is not all of Z and hence is an ideal nZ for some integer
$n \geq 0$. It follows from Theorem 3.1 that Z/nZ is isomorphic to the image
of $f$. In practice, we do not make any distinction between Z/nZ and its
image in R, and we agree to say that R contains Z/nZ as a subring.
Suppose that $n \neq 0$. Then we have the relation

\[ na = 0 \quad \text{for all } a \in R. \]

Indeed, $na = (ne)a = 0a = 0$. Sometimes one says that R has
characteristic $n$. Thus if $n$ is the characteristic of R, then $na = 0$ for all $a \in R$.
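
For a finite ring the characteristic can be found by adding the unit element to itself until 0 is reached. The following Python sketch does exactly that; it is a minimal illustration in which the ring is supplied through its unit element, its zero element and its addition, and in which the search bound and the product ring Z/4Z × Z/6Z (with componentwise addition, as in Exercise 10) are assumptions made here.

```python
def characteristic(one, zero, add, bound=10_000):
    """Smallest n > 0 with n*e = 0, found by repeated addition of the unit element.

    Returns 0 if no such n up to the bound is found. This is only a heuristic
    stand-in for "the kernel of Z -> R is {0}", which a finite search cannot prove.
    """
    total = one  # at step n below, total equals n*e
    for n in range(1, bound + 1):
        if total == zero:
            return n
        total = add(total, one)
    return 0

# The ring Z/7Z.
char_prime = characteristic(1, 0, lambda x, y: (x + y) % 7)

# The ring Z/4Z x Z/6Z with componentwise addition.
def add_pairs(u, v):
    return ((u[0] + v[0]) % 4, (u[1] + v[1]) % 6)

char_product = characteristic((1, 1), (0, 0), add_pairs)
print(char_prime, char_product)
```

The second ring has zero divisors, so there is no conflict with Theorem 3.2 below when its characteristic turns out not to be prime.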

Theorem 3.2. Suppose that R is an integral ring, so has no divisors of
0. Then the integer $n$ such that Z/nZ is contained in R must be 0 or a
prime number.

Proof. Suppose $n$ is not 0 and is not prime. Then $n = mk$ with
integers $m, k \geq 2$, and neither $m$ nor $k$ is in the kernel of the
homomorphism $f\colon \mathbf{Z} \to R$. Hence $me \neq 0$ and $ke \neq 0$. But $(me)(ke) = mke = 0$,
contradicting the hypothesis that R has no divisors of 0. Hence $n$ is
prime.

Let K be a field and let $f\colon \mathbf{Z} \to K$ be the homomorphism of the
integers into K. If the kernel of $f$ is {0}, then K contains Z as a subring,
and we say that K has characteristic 0. If the kernel of $f$ is generated by
a prime number $p$, then we say that K has characteristic $p$. The field
Z/pZ is sometimes denoted by $\mathbf{F}_p$, and is called the prime field of
characteristic $p$. This prime field $\mathbf{F}_p$ is contained in every field of
characteristic $p$.

Let R be a ring. Recall that a unit in R is an element $u \in R$ which
has a multiplicative inverse, that is there exists an element $v \in R$ such
that $uv = e$. The set of units is denoted by R*. This set of units is a
group. Indeed, if $u_1$, $u_2$ are units, then the product $u_1 u_2$ is a unit, because
it has the inverse $u_2^{-1} u_1^{-1}$. The rest of the group axioms are immediately
verified from the ring axioms concerning multiplication.

Example. Let $n$ be an integer $\geq 2$ and let R = Z/nZ. Then the units
of R consist of those elements of R which have a representative $a \in \mathbf{Z}$
which is prime to $n$. (Do Exercise 3.) This group of units is especially
important, and we shall now describe how it occurs as a group of 
automorphisms. 
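
For a given modulus, the description of the units in this Example can be compared with a direct search for inverses. The sketch below is only a numerical check for one value of n, with illustrative function names; it does not replace Exercise 3.

```python
from math import gcd

def units(n):
    """Residues u in Z/nZ for which some v satisfies u*v = 1 (mod n), by exhaustion."""
    return [u for u in range(n) if any((u * v) % n == 1 for v in range(n))]

def prime_to_n(n):
    """Residues having a representative relatively prime to n."""
    return [a for a in range(n) if gcd(a, n) == 1]

print(units(12))
print(units(12) == prime_to_n(12))
```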

Theorem 3.3. Let G be a cyclic group of order N, written
multiplicatively. Let $m \in \mathbf{Z}$ be relatively prime to N, and let

\[ \sigma_m\colon G \to G \]

be the map such that $\sigma_m(x) = x^m$. Then $\sigma_m$ is an automorphism of G, and
the association

\[ m \mapsto \sigma_m \]

induces an isomorphism $(\mathbf{Z}/N\mathbf{Z})^* \xrightarrow{\sim} \mathrm{Aut}(G)$.

Proof. Since $\sigma_m(xy) = (xy)^m = x^m y^m$ (because G is commutative), it
follows that $\sigma_m$ is a homomorphism of G into itself. Since $(m, N) = 1$, we
conclude that $x^m = e \Rightarrow x = e$. Hence $\ker(\sigma_m)$ is trivial, and since G is
finite, it follows that $\sigma_m$ is bijective, so $\sigma_m$ is an automorphism. If
$m \equiv n \bmod N$ then $\sigma_m = \sigma_n$, so $\sigma_m$ depends only on the coset of
$m \bmod N\mathbf{Z}$. We have

\[ \sigma_{mn}(x) = x^{mn} = (x^m)^n = \sigma_n \circ \sigma_m(x), \]

so $m \mapsto \sigma_m$ induces a homomorphism of $(\mathbf{Z}/N\mathbf{Z})^*$ into Aut(G). Let $a$ be
a generator of G. If $\sigma_m = \mathrm{id}$, then $a^m = a$, whence $a^{m-1} = e$ and
$N \mid (m - 1)$, so $m \equiv 1 \bmod N$. Hence the kernel of $m \mapsto \sigma_m$ in $(\mathbf{Z}/N\mathbf{Z})^*$ is
trivial. Finally, let $f\colon G \to G$ be an automorphism. Then $f(a) = a^k$ for
some $k \in \mathbf{Z}$ because $a$ is a generator. Since $f$ is an automorphism, we
must have $(k, N) = 1$, otherwise $a^k$ is not a generator of G. But then for
all $x \in G$, $x = a^i$ ($i$ depends on $x$), we get

\[ f(a^i) = f(a)^i = a^{ki} = (a^i)^k, \]

so $f = \sigma_k$. Hence the injective homomorphism $(\mathbf{Z}/N\mathbf{Z})^* \to \mathrm{Aut}(G)$ given
by $m \mapsto \sigma_m$ is surjective, and is therefore an isomorphism. QED.
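
Theorem 3.3 can be observed in a small case. The Python sketch below assumes N = 12 and represents the element $a^i$ of the cyclic group simply by its exponent $i$ modulo N, so that multiplication in G becomes addition of exponents; this encoding and the function names are choices made for the illustration. Since every homomorphism of G into itself is determined by the image $a^k$ of the generator, it suffices to test the maps $x \mapsto x^k$.

```python
from math import gcd

N = 12  # order of the cyclic group G = {a^0, ..., a^(N-1)}; a^i is stored as i

def sigma(m):
    """The map x -> x^m on G, as a tuple: entry i is the exponent of (a^i)^m."""
    return tuple((m * i) % N for i in range(N))

def compose(s, t):
    """The composite s o t."""
    return tuple(s[t[i]] for i in range(N))

def is_automorphism(s):
    bijective = sorted(s) == list(range(N))
    multiplicative = all(s[(i + j) % N] == (s[i] + s[j]) % N
                         for i in range(N) for j in range(N))
    return bijective and multiplicative

unit_classes = [m for m in range(N) if gcd(m, N) == 1]
automorphisms = {sigma(k) for k in range(N) if is_automorphism(sigma(k))}

print(unit_classes)
print(len(automorphisms))
```

In this case the automorphisms found are exactly the maps $\sigma_m$ with $m$ prime to 12, one for each unit class.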

Let R be a commutative ring. Let P be an ideal. We define P to be
a prime ideal if $P \neq R$ and whenever $a, b \in R$ and $ab \in P$ then $a \in P$ or
$b \in P$. In Exercise 17 you will prove that an ideal of Z is prime if and
only if this ideal is 0, or is generated by a prime number.

Let R be a commutative ring. Let M be an ideal. We define M to be
a maximal ideal if $M \neq R$ and if there is no ideal J such that $R \supset J \supset M$
and $R \neq J$, $J \neq M$.

You should do Exercises 17, 18, and 19 to get acquainted with prime 
and maximal ideals. These exercises will prove: 

Theorem 3.4. Let R be a commutative ring. 

(a) A maximal ideal is prime. 

(b) An ideal P is prime if and only if R/P is integral. 

(c) An ideal M is maximal if and only if R/M is a field.

To do these exercises, you may want to use the following fact: 

Let M be a maximal ideal and let $x \in R$, $x \notin M$. Then

\[ M + Rx = R. \]

Indeed, $M + Rx$ is an ideal containing M and $\neq M$, so $M + Rx$ must be R since M is
assumed maximal.

III, §3. EXERCISES

1. Let $f\colon R \to R'$ be a ring-homomorphism. Show that the image of $f$ is a
subring of R'.

2. Show that a ring-homomorphism of a field K into a ring $R \neq \{0\}$ is an
isomorphism of K onto its image.

3. (a) Let $n$ be a positive integer, and let $\mathbf{Z}_n = \mathbf{Z}/n\mathbf{Z}$ be the factor ring of Z
modulo $n$. Show that the units of $\mathbf{Z}_n$ are precisely those residue classes $\bar{x}$
having a representative integer $x \neq 0$ and relatively prime to $n$. (For the
definition of unit, see the end of §1.)

(b) Let $x$ be an integer relatively prime to $n$. Let $\varphi$ be the Euler function.
Show that $x^{\varphi(n)} \equiv 1 \pmod{n}$.

4. (a) Let $n$ be an integer $\geq 2$. Show that Z/nZ is an integral ring if and
only if $n$ is prime.

(b) Let $p$ be a prime number. Show that in the ring Z/(p), every non-zero
element has a multiplicative inverse, and that the non-zero elements form
a multiplicative group.

(c) If $a$ is an integer, $a \not\equiv 0 \pmod{p}$, show that $a^{p-1} \equiv 1 \pmod{p}$.

5. (a) Let R be a ring, and let $x, y \in R$ be such that $xy = yx$. What is $(x + y)^n$?
(Cf. Exercise 2 of Chapter I, §2.)

(b) Recall that an element $x$ is called nilpotent if there exists a positive integer
$n$ such that $x^n = 0$. If R is commutative and $x$, $y$ are nilpotent, show that
$x + y$ is nilpotent.

6. Let F be a finite field having $q$ elements. Prove that $x^{q-1} = 1$ for every
nonzero element $x \in F$. Show that $x^q = x$ for every element $x$ of F.

7. Chinese Remainder Theorem. Let R be a commutative ring and let $J_1$, $J_2$ be
ideals. They are called relatively prime if

\[ J_1 + J_2 = R. \]

Suppose $J_1$ and $J_2$ are relatively prime. Given elements $a, b \in R$ show that
there exists $x \in R$ such that

\[ x \equiv a \pmod{J_1} \quad \text{and} \quad x \equiv b \pmod{J_2}. \]

[This result applies in particular when R = Z, $J_1 = (m_1)$ and $J_2 = (m_2)$ with
relatively prime integers $m_1$, $m_2$.]

8. If $J_1$, $J_2$ are relatively prime, show that for every positive integer $n$, $J_1^n$ and $J_2^n$
are relatively prime.

9. Let R be a ring, and M, M' two-sided ideals. Assume that M contains M'. If
$x \in R$, denote its residue class mod M by $x(M)$. Show that there is a (unique)
ring-homomorphism $R/M' \to R/M$ which maps $x(M')$ on $x(M)$.

Example. If $n$, $m$ are integers $\neq 0$, such that $n$ divides $m$, apply Exercise 9 to
get a ring-homomorphism $\mathbf{Z}/(m) \to \mathbf{Z}/(n)$.

10. Let R, R' be rings. Let $R \times R'$ be the set of all pairs $(x, x')$ with $x \in R$ and
$x' \in R'$. Show how one can make $R \times R'$ into a ring, by defining addition and
multiplication componentwise. In particular, what is the unit element of
$R \times R'$?

11. Let $R, R_1, \ldots, R_n$ be rings and let $f\colon R \to R_1 \times \cdots \times R_n$. Show that $f$ is a ring
homomorphism if and only if each coordinate map $f_i\colon R \to R_i$ is a ring
homomorphism.

12. (a) Let $J_1$, $J_2$ be relatively prime ideals in a commutative ring R. Show that
the map $a \bmod J_1 \cap J_2 \mapsto (a \bmod J_1, a \bmod J_2)$ induces an isomorphism

\[ f\colon R/(J_1 \cap J_2) \to R/J_1 \times R/J_2. \]

(b) Again, if $J_1$, $J_2$ are relatively prime, show that $J_1 \cap J_2 = J_1 J_2$.

Example. If $m$, $n$ are relatively prime integers, then $(m) \cap (n) = (mn)$.

(c) If $J_1$, $J_2$ are not relatively prime, give an example to show that one does
not necessarily have $J_1 \cap J_2 = J_1 J_2$.

(d) In (a), show that $f$ induces an isomorphism of the unit groups

\[ (R/J_1 J_2)^* \to (R/J_1)^* \times (R/J_2)^*. \]

(e) Let $J_1, \ldots, J_r$ be ideals of R such that $J_i$ is relatively prime to $J_k$ for $i \neq k$.
Show that there is a natural ring isomorphism

\[ R/J_1 \cdots J_r \to \prod_{i} R/J_i. \]


13. Let P be the set of positive integers and R the set of functions defined on P,
with values in a commutative ring K. Define the sum in R to be the ordinary
addition of functions, and define the product by the formula

\[ (f * g)(m) = \sum_{xy = m} f(x) g(y), \]

where the sum is taken over all pairs $(x, y)$ of positive integers such that
$xy = m$. This sum can also be written in the form

\[ (f * g)(m) = \sum_{d \mid m} f(d) g(m/d), \]

where the sum is taken over all positive divisors of $m$, including 1 of course.

(a) Show that R is a commutative ring, whose unit element is the function $\delta$
such that $\delta(1) = 1$ and $\delta(x) = 0$ if $x \neq 1$.

(b) A function $f$ is said to be multiplicative if $f(mn) = f(m)f(n)$ whenever $m$, $n$
are relatively prime. If $f$, $g$ are multiplicative, show that $f * g$ is
multiplicative.

(c) Let $\mu$ be the Moebius function such that $\mu(1) = 1$, $\mu(p_1 \cdots p_r) = (-1)^r$ if
$p_1, \ldots, p_r$ are distinct primes, and $\mu(m) = 0$ if $m$ is divisible by $p^2$ for some
prime $p$. Show that $\mu * \varphi_1 = \delta$ (where $\varphi_1$ denotes the constant function
having value 1). [Hint: Show first that $\mu$ is multiplicative and then prove
the assertion for prime powers.] The Moebius inversion formula of
elementary number theory is then nothing else but the relation

\[ \mu * \varphi_1 * f = f. \]

In other words, if for some function $g$ we have

\[ f(n) = \sum_{d \mid n} g(d) = (\varphi_1 * g)(n), \]

then

\[ g(n) = (\mu * f)(n) = \sum_{d \mid n} \mu(d) f(n/d). \]


The product f * g in this exercise is called the convolution product. Note how 
formalizing this product and viewing functions as elements of a ring under the 
convolution product simplifies the inversion formalism with the Moebius 
function. 

14. Let $f\colon R \to R'$ be a ring-homomorphism. Let J' be a two-sided ideal of R',
and let J be the set of elements $x$ of R such that $f(x)$ lies in J'. Show that J
is a two-sided ideal of R.

15. Let R be a commutative ring, and N the set of elements $x \in R$ such that
$x^n = 0$ for some positive integer $n$. Show that N is an ideal.

16. In Exercise 15, if $\bar{x}$ is an element of R/N, and if there exists an integer $n \geq 1$
such that $\bar{x}^n = 0$, show that $\bar{x} = 0$.

17. Let R be a commutative ring. An ideal P is said to be a prime ideal if
$P \neq R$, and whenever $a, b \in R$ and $ab \in P$ then $a \in P$ or $b \in P$. Show that a
non-zero ideal of Z is prime if and only if it is generated by a prime number.

18. Let R be a commutative ring. An ideal M of R is said to be a maximal
ideal if $M \neq R$, and if there is no ideal J such that $R \supset J \supset M$, and $R \neq J$,
$J \neq M$. Show that every maximal ideal is prime.

19. Let R be a commutative ring.

(a) Show that an ideal P is prime if and only if R/P is integral.

(b) Show that an ideal M is maximal if and only if R/M is a field.

20. Let K be a field of characteristic $p$. Show that $(x + y)^p = x^p + y^p$ for all $x$,
$y \in K$.

21. Let K be a finite field of characteristic $p$. Show that the map $x \mapsto x^p$ is an
automorphism of K.

22. Let S be a set, X a subset, and assume neither S nor X is empty. Let R be a
ring. Let F(S, R) be the ring of all mappings of S into R, and let

\[ \rho\colon F(S, R) \to F(X, R) \]

be the restriction, i.e. if $f \in F(S, R)$, then $\rho(f)$ is just $f$ viewed as a
map of X into R. Show that $\rho$ is surjective. Describe the kernel of $\rho$.

23. Let K be a field and S a set. Let $x_0$ be an element of S. Let F(S, K) be the
ring of mappings of S into K, and let J be the set of maps $f \in F(S, K)$ such
that $f(x_0) = 0$. Show that J is a maximal ideal. Show that F(S, K)/J is
isomorphic to K.

24. Let R be a commutative ring. A map $D\colon R \to R$ is called a derivation if
$D(x + y) = Dx + Dy$, and $D(xy) = (Dx)y + x(Dy)$ for all $x, y \in R$. If $D_1$, $D_2$
are derivations, define the bracket product

\[ [D_1, D_2] = D_1 \circ D_2 - D_2 \circ D_1. \]

Show that $[D_1, D_2]$ is a derivation.

Example. Let R be the ring of infinitely differentiable real-valued
functions of, say, two real variables. Any differential operator

\[ f(x, y)\frac{\partial}{\partial x} + g(x, y)\frac{\partial}{\partial y} \]

with coefficients $f$, $g$ which are infinitely differentiable functions, is a
derivation on R.


III, §4. QUOTIENT FIELDS

In the preceding sections, we have assumed that the reader is acquainted 
with the rational numbers, in order to give examples for more abstract 
concepts. We shall now study how one can define the rationals from the 
integers. Furthermore, in the next chapter, we shall study polynomials 
over a field. One is accustomed to form quotients $f/g$ ($g \neq 0$) of
polynomials, and such quotients are called rational functions. Our discussion
will apply to this situation also.

Before giving the abstract discussion, we analyze the case of the 
rational numbers more closely. In elementary school, what is done 
(or what should be done), is to give rules for determining when two 
quotients of rational numbers are equal. This is needed, because, for 
instance, 3/4 = 6/8. The point is that a fraction is determined by a pair of
numbers, in this special example (3, 4), but also by other pairs, e.g. (6, 8).
If we view all pairs giving rise to the same quotient as equivalent, then
we get our cue how to define the fraction, namely as a certain
equivalence class of pairs. Next, one must give rules for adding fractions, and
the rules we shall give in general are precisely the same as those which 
are (or should be) given in elementary school. 

Our discussion will apply to an arbitrary integral ring R. (Recall
that integral means that $1 \neq 0$, that R is commutative and without
divisors of 0.)

Let $(a, b)$ and $(c, d)$ be pairs of elements in R, with $b \neq 0$ and $d \neq 0$.
We shall say that these pairs are equivalent if $ad = bc$. We contend that
this is an equivalence relation. Going back to the definition of Chapter
I, §5, we see that ER 1 and ER 3 are obvious. As for ER 2, suppose
that $(a, b)$ is equivalent to $(c, d)$ and $(c, d)$ is equivalent to $(e, f)$. By
definition,

\[ ad = bc \quad \text{and} \quad cf = de. \]

Multiplying the first equality by $f$ and the second by $b$, we obtain

\[ adf = bcf \quad \text{and} \quad bcf = bde, \]

whence $adf = bde$, and $daf - dbe = 0$. Then $d(af - be) = 0$. Since R has
no divisors of 0, it follows that $af - be = 0$, i.e. $af = be$. This means that
$(a, b)$ is equivalent to $(e, f)$, and proves ER 2.

We denote the equivalence class of $(a, b)$ by $a/b$. We must now define
how to add and multiply such classes.

If $a/b$ and $c/d$ are such classes, we define their sum to be

\[ \frac{a}{b} + \frac{c}{d} = \frac{ad + bc}{bd} \]

and their product to be

\[ \frac{a}{b} \cdot \frac{c}{d} = \frac{ac}{bd}. \]

We must show of course that in defining the sum and product as above,
the result is independent of the choice of pairs $(a, b)$ and $(c, d)$
representing the given classes. We shall do this for the sum. Suppose that

\[ a/b = a'/b' \quad \text{and} \quad c/d = c'/d'. \]

We must show that

\[ \frac{ad + bc}{bd} = \frac{a'd' + b'c'}{b'd'}. \]

This is true if and only if

\[ b'd'(ad + bc) = bd(a'd' + b'c'), \]

or in other words

\[ b'd'ad + b'd'bc = bda'd' + bdb'c'. \tag{1} \]

But $ab' = a'b$ and $cd' = c'd$ by assumption. Using this, we see at once
that (1) holds. We leave the analogous statement for the product as an
exercise.

We now contend that the set of all quotients $a/b$ with $b \neq 0$ is a ring,
the operations of addition and multiplication being defined as above.
Note first that there is a unit element, namely 1/1, where 1 is the unit
element of R. One must now verify all the other axioms of a ring. This
is tedious, but obvious at each step. As an example, we shall prove the
associativity of addition. For three quotients $a/b$, $c/d$, and $e/f$ we have

\[ \left(\frac{a}{b} + \frac{c}{d}\right) + \frac{e}{f} = \frac{ad + bc}{bd} + \frac{e}{f} = \frac{fad + fbc + bde}{bdf}. \]

On the other hand,

\[ \frac{a}{b} + \left(\frac{c}{d} + \frac{e}{f}\right) = \frac{a}{b} + \frac{cf + de}{df} = \frac{adf + bcf + bde}{bdf}. \]

It is then clear that the expressions on the right-hand sides of these
equations are equal, thereby proving associativity of addition. The other
axioms are equally easy to prove, and we shall omit this tedious routine.
We note that our ring of quotients is commutative.

Let us denote the ring of all quotients $a/b$ by K. We contend that K
is a field. To see this, all we need to do is prove that every non-zero
element has a multiplicative inverse. But the zero element of K is 0/1,
and if $a/b = 0/1$ then $a = 0$. Hence any non-zero element can be written
in the form $a/b$ with $b \neq 0$ and $a \neq 0$. Its inverse is then $b/a$, as one sees
directly from the definition of multiplication of quotients.
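
The construction can be mirrored directly in code. The following Python sketch assumes R = Z and stores a quotient as an unreduced pair, so that equality has to be tested by the rule ad = bc, exactly as in the definition. The class name `Quotient` and its methods are hypothetical, chosen for this illustration.

```python
class Quotient:
    """A class a/b of pairs (a, b) of integers with b != 0, under ad = bc."""

    def __init__(self, a, b):
        if b == 0:
            raise ZeroDivisionError("second entry of the pair must be non-zero")
        self.a, self.b = a, b

    def __eq__(self, other):
        # (a, b) is equivalent to (c, d) exactly when ad = bc.
        return self.a * other.b == self.b * other.a

    def __add__(self, other):
        return Quotient(self.a * other.b + self.b * other.a, self.b * other.b)

    def __mul__(self, other):
        return Quotient(self.a * other.a, self.b * other.b)

    def inverse(self):
        # Defined only for non-zero classes, i.e. when a != 0.
        return Quotient(self.b, self.a)

    def __repr__(self):
        return f"{self.a}/{self.b}"

x, y = Quotient(3, 4), Quotient(6, 8)
print(x == y)
print(x + Quotient(1, 2) == y + Quotient(2, 4))
print(x * x.inverse() == Quotient(1, 1))
```

The second comparison uses different representatives on each side, which is the independence of representatives proved above for the sum.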

Finally, observe that we have a natural map of R into K, namely the
map

\[ a \mapsto a/1. \]

It is again routine to verify that this map is an injective
ring-homomorphism. Any injective ring-homomorphism will be called an
embedding. We see that R is embedded in K in a natural way.

We call K the quotient field of R. When R = Z, then K is by
definition the field of rational numbers. When R is the ring of polynomials
defined in the next chapter, its quotient field is called the field of rational
functions.

Suppose that R is a subring of a field F. The set of all elements $ab^{-1}$
with $a, b \in R$ and $b \neq 0$ is easily seen to form a field, which is a subfield
of F. We also call this field the quotient field of R in F. There can be
no confusion with this terminology, because the quotient field of R as
defined previously is isomorphic to this subfield, under the map

\[ a/b \mapsto ab^{-1}. \]

The verification is trivial, and in view of this, the element $ab^{-1}$ of F is
also denoted by $a/b$.

Example. Let K be a field and as usual, Q the rational numbers. 
There does not necessarily exist an embedding of Q into K (for instance, 
K may be finite). However, if an embedding of Q into K exists, there is 
only one. This is easily seen, because any homomorphism 

\[ f\colon \mathbf{Q} \to K \]

must be such that $f(1) = e$ (unit element of K). Then for any integer
$n > 0$ one sees by induction that $f(n) = ne$, and consequently

\[ f(-n) = -ne. \]

Furthermore,

\[ e = f(1) = f(n n^{-1}) = f(n) f(n^{-1}), \]

so that $f(n^{-1}) = f(n)^{-1} = (ne)^{-1}$. Thus for any quotient $m/n = mn^{-1}$
with integers $m$, $n$ and $n > 0$ we must have

\[ f(m/n) = (me)(ne)^{-1}, \]

thus showing that $f$ is uniquely determined. It is then customary to
identify Q inside K and view every rational number as an element of K. 

Finally, we make some remarks on the extension of an embedding of 
a ring into a field. 

Let R be an integral ring, and

\[ f\colon R \to E \]

an embedding of R into some field E. Let K be the quotient field of R.
Then $f$ admits a unique extension to an embedding of K into E, that is
an embedding $f^*\colon K \to E$ whose restriction to R is equal to $f$.

To see the uniqueness, observe that if $f^*$ is an extension of $f$, and

\[ f^*\colon K \to E \]

is an embedding, then for all $a, b \in R$ with $b \neq 0$ we must have

\[ f^*(a/b) = f^*(a)/f^*(b) = f(a)/f(b), \]

so the effect of $f^*$ on K is determined by the effect of $f$ on R.
Conversely, one can define $f^*$ by the formula

\[ f^*(a/b) = f(a)/f(b), \]

and it is seen at once that the value of $f^*$ is independent of the choice of
the representation of the quotient $a/b$, that is if $a/b = c/d$ with

\[ a, b, c, d \in R \quad \text{and} \quad bd \neq 0, \]

then

\[ f(a)/f(b) = f(c)/f(d). \]

One also verifies routinely that $f^*$ so defined is a homomorphism,
thereby proving the existence.


III, §4. EXERCISES

1. Put in all details in the proof of the existence of the extension $f^*$ at the end
of this section.

2. A (ring-) isomorphism of a ring onto itself is also called an automorphism. Let
R be an integral ring, and $\sigma: R \to R$ an automorphism of R. Show that $\sigma$
admits a unique extension to an automorphism of the quotient field.


CHAPTER IV 


Polynomials 


IV, §1. POLYNOMIALS AND POLYNOMIAL FUNCTIONS 

Let K be a field. Every reader of this book will have written expressions 
like 

$$a_n t^n + a_{n-1} t^{n-1} + \cdots + a_0,$$

where $a_0, \ldots, a_n$ are real or complex numbers. We could also take these
to be elements of K. But what does “t” mean? Or powers of “t” like $t$,
$t^2, \ldots, t^n$?

In elementary courses, when $K = \mathbf{R}$ or $\mathbf{C}$ then we speak of polynomial
functions. We write

$$f(t) = a_n t^n + \cdots + a_0$$

to mean the function of K into itself such that for each element $t \in K$ the
value of the function f is $f(t)$, the value given by the above expression.

But in operating with polynomials, we usually work formally, without
worrying about f being a function. For instance, let $a_0, \ldots, a_n$ be
elements of K and $b_0, \ldots, b_m$ also be elements of K. We just write
expressions like

$$f(t) = a_n t^n + \cdots + a_0,$$

$$g(t) = b_m t^m + \cdots + b_0.$$

If, say, $n > m$ we let $b_j = 0$ if $j > m$ and we also write

$$g(t) = 0 t^n + \cdots + b_m t^m + \cdots + b_0,$$

and we write the sum formally as

$$(f + g)(t) = (a_n + b_n) t^n + \cdots + (a_0 + b_0). \tag{*}$$

If $c \in K$ then we write

$$(cf)(t) = c a_n t^n + \cdots + c a_0.$$

We can also take the product which we write as

$$(fg)(t) = (a_n b_m) t^{n+m} + \cdots + a_0 b_0.$$

In fact, if we write

$$(fg)(t) = c_{n+m} t^{n+m} + \cdots + c_0,$$

then

$$c_k = \sum_{i=0}^{k} a_i b_{k-i} = a_0 b_k + a_1 b_{k-1} + \cdots + a_k b_0. \tag{**}$$

This expression for $c_k$ simply comes from collecting all the terms

$$a_i t^i \, b_{k-i} t^{k-i} = a_i b_{k-i} t^k$$

in the product which will give rise to the term involving $t^k$.
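
The rules (*) and (**) are easy to mechanize. The following Python sketch is an illustration only and not part of the formal development. It assumes a polynomial is stored as a list of coefficients, with the coefficient of $t^i$ at index i; the names `poly_add` and `poly_mul` are chosen here for the example.

```python
def poly_add(f, g):
    """Sum by rule (*): add coefficients of equal powers of t."""
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))   # pad the shorter list with zeros
    g = g + [0] * (n - len(g))
    return [a + b for a, b in zip(f, g)]


def poly_mul(f, g):
    """Product by rule (**): c_k is the sum of a_i * b_j over i + j = k."""
    if not f or not g:
        return []                # the empty list stands for the zero polynomial
    c = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            c[i + j] += a * b
    return c


f = [1, 2]       # 1 + 2t
g = [3, 0, 1]    # 3 + t^2
print(poly_add(f, g))   # [4, 2, 1]     i.e. 4 + 2t + t^2
print(poly_mul(f, g))   # [3, 6, 1, 2]  i.e. 3 + 6t + t^2 + 2t^3
```

Nothing in the two functions uses division, which matches the remark that the coefficients only need to lie in a commutative ring.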

All that matters is that we defined a rule for the addition and
multiplication of the above expressions according to formulas (*) and (**).
Furthermore, it does not matter either that the coefficients $a_i$, $b_j$ are in a
field. The only property we need about such coefficients is that they
satisfy the ordinary properties of arithmetic, or in other words that they
lie in a commutative ring. The only thing we still have to clear up is the
role of the letter “t”, selected arbitrarily. So we must use some device to
define polynomials, and especially a “variable” t. There are several
possible devices, and one of them runs as follows.

Let R be a commutative ring. Let $\mathrm{Pol}_R$ be the set of infinite vectors

$$(a_0, a_1, a_2, \ldots, a_n, \ldots)$$

with $a_n \in R$ and such that all but a finite number of $a_n$ are equal to 0.
Thus a vector looks like

$$(a_0, a_1, \ldots, a_d, 0, 0, 0, \ldots)$$

with zeros all the way to the right. Elements of $\mathrm{Pol}_R$ are called
polynomials over R. The elements $a_0, a_1, \ldots$ are called the coefficients of
the polynomial. The zero polynomial is the polynomial $(0, 0, \ldots)$ which
has $a_i = 0$ for all i. We define addition of infinite vectors componentwise,
just as for finite n-tuples. Then $\mathrm{Pol}_R$ is an additive group. We define
multiplication to mimic the multiplication we know already. If

$$f = (a_0, a_1, \ldots) \quad\text{and}\quad g = (b_0, b_1, \ldots)$$

are polynomials with coefficients in R, we define their product to be

$$fg = (c_0, c_1, \ldots) \quad\text{with}\quad c_k = \sum_{i=0}^{k} a_i b_{k-i} = \sum_{i+j=k} a_i b_j.$$

It is then a routine matter to prove that under this definition of
multiplication $\mathrm{Pol}_R$ is a commutative ring. We shall prove associativity
of multiplication and leave the other axioms as exercises.

Let

$$h = (d_0, d_1, \ldots)$$

be a polynomial. Then

$$(fg)h = (e_0, e_1, \ldots)$$

where by definition

$$e_s = \sum_{k+r=s} c_k d_r = \sum_{k+r=s} \Bigl( \sum_{i+j=k} a_i b_j \Bigr) d_r = \sum_{i+j+r=s} a_i b_j d_r.$$

This last sum is taken over all triples $(i, j, r)$ of integers $\geq 0$ such that
$i + j + r = s$. If we now compute $f(gh)$ in a similar way, we find exactly
the same coefficients for $(fg)h$ as for $f(gh)$, thereby proving associativity.
We leave the proofs of the other properties to the reader.

Now pick a letter, for instance t, to denote

$$t = (0, 1, 0, 0, 0, \ldots).$$

Thus t has coefficient 0 in the 0-th place, coefficient 1 in the first place,
and all other coefficients equal to 0. Having our ring structure, we may
now take powers of t, for instance

$$t, \; t^2, \; t^3, \; \ldots$$

By induction or whatever means you want, you will prove immediately
that if n is a positive integer, then

$$t^n = (0, 0, \ldots, 0, 1, 0, 0, \ldots),$$

in other words $t^n$ is the vector having the n-th component equal to 1, and
all other components equal to 0.
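
As a quick check of this description of $t^n$, here is a small Python sketch. It is only an illustration under an assumed representation: a vector with finitely many non-zero entries is stored as a list with the trailing zeros omitted, and the helper names `mul` and `power` are introduced for the example.

```python
def mul(f, g):
    """Product of two non-zero polynomials given as coefficient lists."""
    c = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            c[i + j] += a * b      # contributes to the coefficient c_{i+j}
    return c


def power(f, n):
    """f multiplied by itself n times, for an integer n >= 1."""
    result = f
    for _ in range(n - 1):
        result = mul(result, f)
    return result


t = [0, 1]   # the vector (0, 1, 0, 0, ...), trailing zeros omitted
for n in range(1, 5):
    print(n, power(t, n))   # a single 1 in position n (counting from 0)
```

The same helper can be used to spot-check associativity on examples, for instance by comparing `mul(mul(f, g), h)` with `mul(f, mul(g, h))`.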

The association

$$a \mapsto (a, 0, 0, \ldots) \quad\text{for } a \in R$$

is an embedding of R in the polynomial ring $\mathrm{Pol}_R$, in other words it is an
injective ring homomorphism. We shall identify a and the vector
$(a, 0, 0, 0, \ldots)$. Then we can multiply a polynomial $f = (a_0, a_1, \ldots)$ by an
element of R componentwise, that is

$$af = (aa_0, aa_1, aa_2, \ldots).$$

This corresponds to the ordinary multiplication of a polynomial by a
scalar.

Observe that we can now write

$$f = a_0 + a_1 t + \cdots + a_d t^d = (a_0, a_1, \ldots, a_d, 0, 0, 0, \ldots)$$

if $a_n = 0$ for $n > d$. This is the more usual way of writing a polynomial.
So we have recovered all the basic properties concerning addition and
multiplication of polynomials. The polynomial ring will be denoted by
$R[t]$.

Let R be a subring of a commutative ring S. If $f \in R[t]$ is a
polynomial, then we may define the associated polynomial function

$$f_S: S \to S$$

by letting for $x \in S$

$$f_S(x) = f(x) = a_0 + a_1 x + \cdots + a_d x^d.$$

Therefore $f_S$ is a function (mapping) of S into itself, determined by the
polynomial f. Given an element $c \in S$, directly from the definition of
multiplication of polynomials, we find:

The association

$$\mathrm{ev}_c: f \mapsto f(c)$$

is a ring homomorphism of $R[t]$ into S.
This property simply says that

$$(f + g)(c) = f(c) + g(c) \quad\text{and}\quad (fg)(c) = f(c)g(c).$$

Also the polynomial 1 maps to the unit element 1 of S. This homomorphism
is called the evaluation homomorphism, and is denoted by $\mathrm{ev}_c$ for
obvious reasons. You are used to evaluating polynomials at numbers,
and all we have done is to point out that the whole procedure of
evaluating polynomials is applicable to a much more general context over
commutative rings. The element t in the polynomial ring is also called a
variable over R. The evaluation $f(c)$ of f at c is also said to be obtained
by substitution of c in the polynomial.

Note that $f = f(t)$, according to our definition of the evaluation
homomorphism.
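
The homomorphism property of evaluation can be spot-checked numerically. The sketch below is illustrative only: it takes R = S to be the integers, stores polynomials as coefficient lists (index i for the coefficient of $t^i$), and uses helper names `add`, `mul` and `ev` that are assumptions of this example.

```python
def add(f, g):
    """Coefficientwise sum of two coefficient lists."""
    n = max(len(f), len(g))
    return [(f[i] if i < len(f) else 0) + (g[i] if i < len(g) else 0)
            for i in range(n)]


def mul(f, g):
    """Product with c_k = sum of a_i * b_j over i + j = k."""
    if not f or not g:
        return []
    c = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            c[i + j] += a * b
    return c


def ev(f, c):
    """Value f(c), computed by Horner's rule."""
    value = 0
    for a in reversed(f):
        value = value * c + a
    return value


f = [1, 2]       # 1 + 2t
g = [3, 0, 1]    # 3 + t^2
c = 5
print(ev(add(f, g), c) == ev(f, c) + ev(g, c))   # True
print(ev(mul(f, g), c) == ev(f, c) * ev(g, c))   # True
print(ev([1], c))                                # 1, the unit element
```

A check at one point is of course not a proof; the identity holds for every c because the product of polynomials was defined to mimic ordinary multiplication.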

Let R be a subring of a ring S and let x be an element of S. We
denote by $R[x]$ the set of all elements $f(x)$ with $f \in R[t]$. Then it is
immediately verified that $R[x]$ is a commutative subring of S, which is
said to be generated by x over R. The evaluation homomorphism

$$R[t] \to R[x]$$

is then a ring homomorphism onto $R[x]$. If the evaluation map
$f \mapsto f(x)$ gives an isomorphism of $R[t]$ with $R[x]$, then we say that x is
transcendental over R or that x is a variable over R.

Example. Let $\alpha = \sqrt{2}$. Then the set of all real numbers of the form

$$a + b\alpha \quad\text{with } a, b \in \mathbf{Z}$$

is a subring of the real numbers, generated by $\sqrt{2}$. This is the subring
$\mathbf{Z}[\sqrt{2}]$. (Prove in detail that it is a subring.) Note that $\alpha$ is not
transcendental over $\mathbf{Z}$. For instance, the polynomial $t^2 - 2$ lies in the
kernel of the evaluation map $f(t) \mapsto f(\sqrt{2})$.
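
To make this example concrete, one can compute in $\mathbf{Z}[\sqrt{2}]$ exactly, without floating point. In the following sketch (an illustration, with all function names invented for the purpose) the pair `(a, b)` of integers stands for $a + b\sqrt{2}$, and multiplication uses $\sqrt{2} \cdot \sqrt{2} = 2$.

```python
def add_sqrt2(x, y):
    """Sum in Z[sqrt 2]; the pair (a, b) stands for a + b*sqrt(2)."""
    return (x[0] + y[0], x[1] + y[1])


def mul_sqrt2(x, y):
    """(a + b s)(c + d s) = (ac + 2bd) + (ad + bc) s, where s*s = 2."""
    a, b = x
    c, d = y
    return (a * c + 2 * b * d, a * d + b * c)


def evaluate(f, x):
    """Evaluate an integer coefficient list f at x in Z[sqrt 2] (Horner's rule)."""
    value = (0, 0)
    for coeff in reversed(f):
        value = add_sqrt2(mul_sqrt2(value, x), (coeff, 0))
    return value


alpha = (0, 1)                       # sqrt(2) itself
print(evaluate([-2, 0, 1], alpha))   # (0, 0): t^2 - 2 is in the kernel
print(evaluate([1, 1], alpha))       # (1, 1): 1 + t maps to 1 + sqrt(2)
```

The first output shows a non-zero polynomial mapping to 0, so the evaluation map is not injective and $\sqrt{2}$ is not transcendental over $\mathbf{Z}$.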

Example. The polynomial ring $R[t]$ is generated by the variable t over
R, and t is transcendental over R.

If x, y are transcendental over R, then $R[x]$ and $R[y]$ are isomorphic,
since they are both isomorphic to the polynomial ring $R[t]$. Thus our
definition of polynomials was merely a concrete way of dealing with a
ring generated over R by a transcendental element.

Warning. When we speak of a polynomial, we always mean a
polynomial as defined above. If we mean the associated polynomial
function, we shall say so explicitly. In some cases, it may be that two
polynomials can be distinct, but give rise to the same polynomial
function on a given ring. For example, let $\mathbf{F}_p = \mathbf{Z}/p\mathbf{Z}$ be the field with p
elements. Then for every element $x \in \mathbf{F}_p$ we have

$$x^p = x.$$

Indeed, if $x = 0$ this is obvious, and if $x \neq 0$, since the multiplicative
group of $\mathbf{F}_p$ has $p - 1$ elements, we get $x^{p-1} = 1$. It follows that $x^p = x$.
Thus we see that if we let $K = \mathbf{F}_p$ and we let

$$f = t^p \quad\text{and}\quad g = t$$

then $f_K = g_K$ but $f \neq g$. In our original notation,

$$f = (0, 0, \ldots, 0, 1, 0, 0, \ldots) \quad\text{and}\quad g = (0, 1, 0, 0, 0, \ldots).$$

If K is an infinite field, then this phenomenon cannot happen, as we
shall prove below. Most of our work will be over infinite fields, but finite
fields are sufficiently important so that we have taken care of their
possibilities from the beginning. Suppose that F is a finite subfield of an
infinite field K. Let $f(t)$ and $g(t) \in F[t]$. Then it may happen that $f \neq g$,
$f_K \neq g_K$, but $f_F = g_F$. For instance, the polynomials $t^p$ and $t$ give rise to
the same function on $\mathbf{Z}/p\mathbf{Z}$, but to different functions on any infinite field
K containing $\mathbf{Z}/p\mathbf{Z}$ according to what we shall prove below.
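
The warning can be observed directly on a small case. The sketch below is an illustration under stated assumptions: elements of $\mathbf{Z}/p\mathbf{Z}$ are represented by the integers 0, ..., p − 1, polynomials by coefficient lists, and the helper names are invented for the example.

```python
def evaluate_mod(poly, x, p):
    """Value of the coefficient list poly at x, computed modulo p."""
    value = 0
    for a in reversed(poly):
        value = (value * x + a) % p
    return value


def same_function(f, g, p):
    """True if f and g take the same value at every element of Z/pZ."""
    return all(evaluate_mod(f, x, p) == evaluate_mod(g, x, p)
               for x in range(p))


p = 5
f = [0] * p + [1]   # t^p, i.e. (0, 0, 0, 0, 0, 1, 0, ...)
g = [0, 1]          # t,   i.e. (0, 1, 0, ...)

print(same_function(f, g, p))   # True: equal as functions on Z/5Z
print(f == g)                   # False: different coefficient vectors
```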

We now return to a general field K. 

When we write a polynomial

$$f(t) = a_n t^n + \cdots + a_0$$

with $a_i \in K$ for $i = 0, \ldots, n$ then these elements of K are called the
coefficients of the polynomial f. If n is the largest integer such that $a_n \neq 0$,
then we say that n is the degree of f and write $n = \deg f$. We also say
that $a_n$ is the leading coefficient of f. We say that $a_0$ is the constant term
of f.

Example. Let

$$f(t) = 7t^5 - 8t^3 + 4t - \sqrt{2}.$$

Then f has degree 5. The leading coefficient is 7, and the constant term
is $-\sqrt{2}$.

If f is the zero polynomial then we shall use the convention that
$\deg f = -\infty$. We agree to the convention that

$$-\infty + -\infty = -\infty,$$

$$-\infty + a = -\infty, \qquad -\infty < a$$

for every integer a, and no other operation with $-\infty$ is defined.

A polynomial of degree 1 is also called a linear polynomial.

Let $\alpha$ be an element of K. We shall say that $\alpha$ is a root of f if

$$f(\alpha) = 0.$$

Theorem 1.1. Let f be a polynomial in $K[t]$, written in the form

$$f(t) = a_n t^n + \cdots + a_0,$$

and suppose f has degree $n \geq 0$, so $f \neq 0$ (f is not the zero polynomial).
Then f has at most n roots in K.

Proof. We shall need a lemma.

Lemma 1.2. Let f be a polynomial over K, and let $\alpha \in K$. Then there
exist elements $c_0, \ldots, c_n \in K$ such that

$$f(t) = c_0 + c_1 (t - \alpha) + \cdots + c_n (t - \alpha)^n.$$

Proof. We write $t = \alpha + (t - \alpha)$, and substitute this value for t in the
expression of f. For each integer k with $1 \leq k \leq n$, we have

$$t^k = (\alpha + (t - \alpha))^k = \alpha^k + \cdots + (t - \alpha)^k$$

(the expansion being that obtained with the binomial coefficients), and
therefore

$$a_k t^k = a_k \alpha^k + \cdots + a_k (t - \alpha)^k$$

can be written as a sum of powers of $(t - \alpha)$, multiplied by elements of
K. Taking the sum of $a_k t^k$ for $k = 0, \ldots, n$ we find the desired expression
for f, and prove the lemma.
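
The coefficients $c_0, \ldots, c_n$ of the lemma can be computed in practice. The sketch below does not follow the binomial expansion used in the proof; as an assumption of this example it uses repeated division by $t - \alpha$ (synthetic division), which yields the same coefficients: the successive remainders are $c_0, c_1, \ldots$ The function names are hypothetical, and coefficient lists are in increasing degree.

```python
def divide_by_linear(f, alpha):
    """Return (q, r) with f(t) = (t - alpha) * q(t) + r, r a constant.

    f is a non-empty coefficient list in increasing degree.
    """
    q = [0] * (len(f) - 1)
    carry = 0
    for k in range(len(f) - 1, 0, -1):   # from the leading coefficient down
        carry = f[k] + carry * alpha
        q[k - 1] = carry
    r = f[0] + carry * alpha
    return q, r


def expand_around(f, alpha):
    """Coefficients c_0, ..., c_n with f(t) = sum of c_i * (t - alpha)^i."""
    coeffs = []
    while f:
        f, r = divide_by_linear(f, alpha)
        coeffs.append(r)
    return coeffs


# t^3 = 8 + 12 (t - 2) + 6 (t - 2)^2 + (t - 2)^3
print(expand_around([0, 0, 0, 1], 2))   # [8, 12, 6, 1]
```

In the output, the first entry 8 is the value of $t^3$ at 2, in agreement with $f(\alpha) = c_0$.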

Observe that in the lemma, we have $f(\alpha) = c_0$. Hence, if $f(\alpha) = 0$,
then $c_0 = 0$, and we can write

$$f(t) = (t - \alpha) h(t),$$

where we can write

$$h(t) = d_1 + d_2 (t - \alpha) + \cdots + d_n (t - \alpha)^{n-1}$$

for some elements $d_1, d_2, \ldots, d_n$ in K. Suppose that f has more than n
roots in K, and say $\alpha_1, \ldots, \alpha_{n+1}$ are $n + 1$ distinct roots in K. Let $\alpha = \alpha_1$.
Then $\alpha_i - \alpha_1 \neq 0$ for $i = 2, \ldots, n + 1$. Since

$$0 = f(\alpha_i) = (\alpha_i - \alpha_1) h(\alpha_i),$$

we conclude that $h(\alpha_i) = 0$ for $i = 2, \ldots, n + 1$. By induction on n we
now see that this is impossible, thereby proving that f has at most n
roots in K.

Corollary 1.3. Let $f(t) = a_n t^n + \cdots + a_0$ and $g(t) = b_n t^n + \cdots + b_0$.
Suppose that K is an infinite field. If $f(c) = g(c)$ for all $c \in K$ then
$f = g$, that is $a_k = b_k$ for all $k = 0, \ldots, n$.


Proof. Consider the polynomial

$$f(t) - g(t) = (a_n - b_n) t^n + \cdots + (a_0 - b_0).$$

Every element of K is a root of this polynomial. Hence by Theorem 1.1,
we must have $a_i - b_i = 0$ for $i = 0, \ldots, n$, in other words, $a_i = b_i$, thereby
proving the corollary.

Corollary 1.3 shows that over an infinite field, a polynomial is no dif- 
ferent from a polynomial function. For what we do in this chapter, how- 
ever, most results are true working formally with polynomials. So we do 
not necessarily assume that the base field is infinite. 

Our convention on the degree of a polynomial was also useful to make 
the following result true without exception. 

Theorem 1.4. Let f, g be polynomials with coefficients in K. Then

$$\deg(fg) = \deg f + \deg g.$$


Proof. Let

$$f(t) = a_n t^n + \cdots + a_0 \quad\text{and}\quad g(t) = b_m t^m + \cdots + b_0$$

with $a_n \neq 0$ and $b_m \neq 0$. Then from the multiplication rule for fg, we see
that

$$f(t)g(t) = a_n b_m t^{n+m} + \text{terms of lower degree},$$

and $a_n b_m \neq 0$. Hence $\deg fg = n + m = \deg f + \deg g$. If f or g is 0, then
our convention about $-\infty$ makes our assertion also come out.
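
The convention $\deg 0 = -\infty$ happens to match the arithmetic of floating-point infinity, which gives a convenient way to experiment with the degree formula. The following is an illustrative sketch only, with integer coefficients and helper names chosen for the example; the empty list stands for the zero polynomial.

```python
import math


def mul(f, g):
    """Product of coefficient lists; [] is the zero polynomial."""
    if not f or not g:
        return []
    c = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            c[i + j] += a * b
    return c


def degree(f):
    """Largest k with f[k] != 0, or -infinity for the zero polynomial."""
    for k in range(len(f) - 1, -1, -1):
        if f[k] != 0:
            return k
    return -math.inf


f = [1, 0, 3]   # 1 + 3t^2
g = [2, 5]      # 2 + 5t
zero = []

print(degree(mul(f, g)) == degree(f) + degree(g))         # True: 3 = 2 + 1
print(degree(mul(f, zero)) == degree(f) + degree(zero))   # True: -inf = 2 + (-inf)
print(degree(zero) + degree(zero) == degree(zero))        # True
```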

Corollary 1.5. The ring $K[t]$ has no divisors of zero, and is therefore an
integral ring.

Proof. If f, g are non-zero polynomials, then $\deg f$ and $\deg g$ are $\geq 0$,
whence $\deg(fg) \geq 0$, and $fg \neq 0$, as was to be shown.


In light of Corollary 1.5, we may form the quotient field of the
polynomial ring $K[t]$. This quotient field is denoted by $K(t)$ and is called the
field of rational functions. Its elements consist of quotients

$$f(t)/g(t)$$

where f, g are polynomials. More precisely, the elements of $K(t)$ are
equivalence classes of such quotients, where

$$f/g = f_1/g_1 \quad\text{if and only if}\quad f g_1 = g f_1.$$

This relation is merely the relation of elementary school arithmetic, as we
have seen in Chapter III, §4.

The next theorem is the Euclidean algorithm or long division, taught 
in elementary school. It is the analogue of the Euclidean algorithm for 
integers. 

Theorem 1.6. Let f, g be polynomials over the field K, i.e. polynomials
in $K[t]$, and assume $\deg g \geq 0$. Then there exist polynomials q, r in
$K[t]$ such that

$$f(t) = q(t)g(t) + r(t),$$

and $\deg r < \deg g$. The polynomials q, r are uniquely determined by
these conditions.

Proof. Let $m = \deg g \geq 0$. Write

$$f(t) = a_n t^n + \cdots + a_0,$$

$$g(t) = b_m t^m + \cdots + b_0,$$

with $b_m \neq 0$. If $n < m$, let $q = 0$, $r = f$. If $n \geq m$, let

$$f_1(t) = f(t) - a_n b_m^{-1} t^{n-m} g(t).$$

(This is the first step in the process of long division.) Then

$$\deg f_1 < \deg f.$$

Continuing in this way, or more formally by induction on n, we can find
polynomials $q_1$, r such that

$$f_1 = q_1 g + r,$$

with $\deg r < \deg g$. Then

$$\begin{aligned}
f(t) &= a_n b_m^{-1} t^{n-m} g(t) + f_1(t) \\
&= a_n b_m^{-1} t^{n-m} g(t) + q_1(t) g(t) + r(t) \\
&= (a_n b_m^{-1} t^{n-m} + q_1) g(t) + r(t),
\end{aligned}$$

and we have consequently expressed our polynomial in the desired form.
To prove the uniqueness, suppose that

$$f = q_1 g + r_1 = q_2 g + r_2,$$

with $\deg r_1 < \deg g$ and $\deg r_2 < \deg g$. Then

$$(q_1 - q_2) g = r_2 - r_1.$$

Either the degree of the left-hand side is $\geq \deg g$, or the left-hand side is
equal to 0. Either the degree of the right-hand side is $< \deg g$, or the
right-hand side is equal to 0. Hence the only possibility is that they are
both 0, whence

$$q_1 = q_2 \quad\text{and}\quad r_1 = r_2,$$


as was to be shown.
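
The existence part of the proof is an algorithm: subtract $a_n b_m^{-1} t^{n-m} g(t)$ and repeat until the degree drops below $\deg g$. A possible Python rendering is sketched below. It is an illustration under assumptions not made in the text: the field K is taken to be the rational numbers (via the standard `fractions.Fraction` type), polynomials are coefficient lists in increasing degree, and the names `trim` and `divmod_poly` are invented here.

```python
from fractions import Fraction


def trim(f):
    """Drop trailing zero coefficients, so the zero polynomial becomes []."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f


def divmod_poly(f, g):
    """Return (q, r) with f = q*g + r and deg r < deg g, over the rationals."""
    f = [Fraction(a) for a in trim(f)]
    g = [Fraction(b) for b in trim(g)]
    if not g:
        raise ZeroDivisionError("division by the zero polynomial")
    q = [Fraction(0)] * max(len(f) - len(g) + 1, 0)
    r = f
    while len(r) >= len(g):
        shift = len(r) - len(g)        # the exponent n - m
        factor = r[-1] / g[-1]         # a_n * b_m^{-1}
        q[shift] = factor
        for i, b in enumerate(g):      # subtract factor * t^shift * g(t)
            r[shift + i] -= factor * b
        r = trim(r)                    # the leading term has cancelled
    return q, r


# 2t^3 + 3t + 1 = (t^2 - t/2 + 7/4)(2t + 1) - 3/4
q, r = divmod_poly([1, 3, 0, 2], [1, 2])
print(q)   # coefficients 7/4, -1/2, 1 (as Fraction objects)
print(r)   # coefficient -3/4
```

Division by the leading coefficient $b_m$ is the only place where the field property is used; this is why the quotient in the example has non-integer coefficients even though f and g do not.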

From the Euclidean algorithm, we can reprove a fact already proved 
by other means. 

Corollary 1.7. Let f be a non-zero polynomial in $K[t]$. Let $\alpha \in K$ be
such that $f(\alpha) = 0$. Then there exists a polynomial $q(t)$ in $K[t]$ such
that

$$f(t) = (t - \alpha) q(t).$$


Proof. We can write

$$f(t) = q(t)(t - \alpha) + r(t),$$

where $\deg r < \deg(t - \alpha)$. But $\deg(t - \alpha) = 1$. Hence r is constant. Since

$$0 = f(\alpha) = q(\alpha)(\alpha - \alpha) + r(\alpha) = r(\alpha),$$

it follows that $r = 0$, as desired.

Corollary 1.8. Let K be a field such that every non-constant polynomial
in $K[t]$ has a root in K. Let f be such a polynomial. Then there exist
elements $\alpha_1, \ldots, \alpha_n \in K$ and $c \in K$ such that

$$f(t) = c(t - \alpha_1) \cdots (t - \alpha_n).$$

Proof. In Corollary 1.7, observe that $\deg q = \deg f - 1$. Let $\alpha = \alpha_1$ in
Corollary 1.7. By assumption, if q is not constant, we can find a root $\alpha_2$
of q, and thus write

$$f(t) = q_2(t)(t - \alpha_1)(t - \alpha_2).$$

Proceeding inductively, we keep on going until $q_n$ is constant.

A field K having the property stated in Corollary 1.8, that every non- 
constant polynomial over K has a root in K , is called algebraically 
closed. We shall prove later in the book that the complex numbers are 
algebraically closed. We shall also prove later that a finite field is not 
algebraically closed. You may assume this from now on. 

We now reprove Theorem 1.1 using the Euclidean algorithm. 

Corollary 1.9. Let K be a field and f a polynomial of degree $n \geq 1$.
Then f has at most n roots in K.

Proof. Let $\alpha_1, \ldots, \alpha_r$ be distinct roots of f in K. Then by the
Euclidean algorithm we know there is a factorization

$$f(t) = (t - \alpha_1) \cdots (t - \alpha_r) g(t),$$

so $r \leq n$, as was to be shown.

Example. Let F be a finite field, say $F = \mathbf{Z}/p\mathbf{Z}$ where p is a prime
number. The polynomial

$$f(t) = t^p - 1$$

is equal to $(t - 1)^p$ and therefore has only one root, namely 1.

Suppose F has characteristic p. If $p = 2$ then the polynomial $t^2 - 1$,
which is $(t - 1)^2$, has only one root, namely 1. On the other hand, if
$p \neq 2$ then this polynomial has two distinct roots, 1 and $-1$. In case
$p \neq 2$ we have $1 \neq -1$ in F, for otherwise $1 = -1$ implies $1 + 1 = 2 = 0$
in F, so F would have characteristic 2.
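
The identity $t^p - 1 = (t - 1)^p$ over $\mathbf{Z}/p\mathbf{Z}$ can be confirmed by expanding the right-hand side with coefficients reduced mod p. The sketch below is illustrative; the representation (coefficient lists of integers in the range 0, ..., p − 1) and the names `mul_mod` and `power_identity_holds` are assumptions of this example.

```python
def mul_mod(f, g, p):
    """Product of coefficient lists with coefficients reduced modulo p."""
    c = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            c[i + j] = (c[i + j] + a * b) % p
    return c


def power_identity_holds(p):
    """Check whether (t - 1)^p equals t^p - 1 with coefficients mod p."""
    linear = [(-1) % p, 1]                       # t - 1
    power = [1]
    for _ in range(p):
        power = mul_mod(power, linear, p)
    target = [(-1) % p] + [0] * (p - 1) + [1]    # t^p - 1
    return power == target


for p in (2, 3, 5, 7):
    print(p, power_identity_holds(p))   # True for each prime listed
print(4, power_identity_holds(4))       # False: 4 is not prime
```

The middle coefficients of the expansion are binomial coefficients, which are divisible by p when p is prime; for the modulus 4 the coefficient 6 of $t^2$ does not vanish, so the identity fails there.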

As an application of Corollary 1.9 we can completely determine the 
structure of finite subgroups of the multiplicative group in a field. 

Theorem 1.10. Let K be a field and let G be a finite subgroup of the
group of non-zero elements. Then G is cyclic.

Proof. Here we shall give a proof using the structure theorem for finite
abelian groups. By that theorem, we know that

$$G = \prod_p G(p)$$

is the direct product of the subgroups $G(p)$ which consist of elements
whose period is a power of p. By Exercise 18 of Chapter II, §1 it will
suffice to prove that each $G(p)$ is cyclic. If $G(p)$ is not cyclic, then by the
structure theorem, Theorem 7.2 of Chapter II, $G(p)$ contains a product
$H_1 \times H_2$ where $H_1$ is cyclic of order $p^r$ and $H_2$ is cyclic of order $p^s$, with
$r, s \geq 1$. Say $r \geq s$. Then every element of $G(p)$ lying in the product of
these two factors satisfies the equation

$$t^{p^r} - 1 = 0.$$

This equation has at most $p^r$ roots, but there are more than $p^r$ elements
in the product of the two factors (there are in fact $p^r p^s = p^{r+s}$ elements in
this product). This contradiction proves the theorem.

Remark. If you don't like the use of the structure theorem for abelian 
groups, then give an independent proof of just those properties which are 
needed here. Such a proof will be given in Chapter VIII, §3. On the 
other hand, you could also use Exercise 2(b) of Chapter II, §7, which can 
be proved directly more easily than the structure theorem. 


We denote by $\mu_n$ the group of n-th roots of unity. This is the set of
elements $\zeta$ such that $\zeta^n = 1$. Strictly speaking, we should denote by $\mu_n(K)$
the group of n-th roots of unity in K, but we often omit the K when the
reference to the field is clear by the context. Suppose that K has
characteristic p. Then

$$\mu_p = \{1\}.$$

Indeed suppose $\zeta^p = 1$. Then $\zeta^p - 1 = 0$. But

$$\zeta^p - 1 = (\zeta - 1)^p = 0,$$

so $\zeta - 1 = 0$ and $\zeta = 1$. We shall see in §3 that if p does not divide n,
then $\mu_n$ has order n, and is thus a cyclic group of order n. A generator
for $\mu_n$ is called a primitive n-th root of unity. In the complex numbers,
$\mu_n = \mu_n(\mathbf{C})$ is the ordinary group of n-th roots of unity, generated by $e^{2\pi i/n}$.
The primitive n-th roots of unity are $e^{2\pi i r/n}$ with r prime to n.

Consider the field $F = \mathbf{Z}/p\mathbf{Z}$. An integer $a \in \mathbf{Z}$ such that its image in
$\mathbf{Z}/p\mathbf{Z}$ is a generator of $F^*$ is called a primitive root mod p. Thus the
period of a mod p is $p - 1$. Artin conjectured that there are infinitely
many primes p such that, for instance, 2 is a primitive root mod p. Cf. the
introduction to his collected works. The answer is still not known.
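
For small primes, primitive roots can be found by brute force, which also illustrates Theorem 1.10 for $F^*$: a generator always turns up. The sketch below is a naive illustration, not an efficient method; the function names are invented for the example and p is assumed to be prime.

```python
def order_mod(a, p):
    """Period of a modulo the prime p, for a not divisible by p."""
    k, x = 1, a % p
    while x != 1:
        x = (x * a) % p
        k += 1
    return k


def primitive_roots(p):
    """All a in 1, ..., p - 1 whose period mod p is p - 1."""
    return [a for a in range(1, p) if order_mod(a, p) == p - 1]


print(primitive_roots(7))   # [3, 5]
print(order_mod(2, 7))      # 3, so 2 is not a primitive root mod 7
print(order_mod(2, 11))     # 10, so 2 is a primitive root mod 11
```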

For further remarks on Theorem 1.10 in connection with finite fields, 
and especially Z/pZ, see Chapter VIII, §3. 


IV, §1. EXERCISES

1. In each of the following cases, write $f = qg + r$ with $\deg r < \deg g$.

(a) $f(t) = t^2 - 2t + 1$, $g(t) = t - 1$

(b) $f(t) = t^3 + t - 1$, $g(t) = t^2 + 1$

(c) $f(t) = t^3 + t$, $g(t) = t$

(d) $f(t) = t^3 - 1$, $g(t) = t - 1$

2. If $f(t)$ has integer coefficients and if $g(t)$ has integer coefficients and leading
coefficient 1, show that when we express $f = qg + r$ with $\deg r < \deg g$, the
polynomials q and r also have integer coefficients.

3. Using the intermediate value theorem of calculus, show that every polynomial 
of odd degree over the real numbers has a root in the real numbers. 

4. Let $f(t) = t^n + \cdots + a_0$ be a polynomial with complex coefficients, of degree n,
and let $\alpha$ be a root. Show that $|\alpha| \leq n \cdot \max_i |a_i|$. [Hint: Note $a_n = 1$. Write

$$-\alpha^n = a_{n-1} \alpha^{n-1} + \cdots + a_0.$$

If $|\alpha| > n \max_i |a_i|$, divide by $\alpha^n$ and take the absolute value, together with a
simple estimate, to get a contradiction.]

In Exercises 5 and 6 you may assume that the roots of a polynomial in an
algebraically closed field are uniquely determined up to a permutation.

5. Let $f(t) = t^3 - 1$. Show that the three roots of f in the complex numbers are

$$1, \quad e^{2\pi i/3}, \quad e^{-2\pi i/3}.$$

Express these roots as $a + b\sqrt{-3}$ where a, b are rational numbers.

6. Let n be an integer $\geq 2$. How would you describe the roots of the
polynomial $f(t) = t^n - 1$ in the complex numbers?

7. Let F be a field and let $\sigma: F[t] \to F[t]$ be an automorphism of the
polynomial ring such that $\sigma$ restricts to the identity on F. Show that there
exist elements $a \in F$, $a \neq 0$, and $b \in F$ such that $\sigma t = at + b$.


Finite fields 

8. Let F be a finite field. Let c be the product of all non-zero elements of F.
Show that $c = -1$.

Example. Let $F = \mathbf{Z}/p\mathbf{Z}$. Then the result of Exercise 8 can also be stated in
the form

$$(p - 1)! \equiv -1 \pmod{p},$$

which is known as Wilson’s theorem.

9. Let p be a prime of the form $p = 4n + 1$ where n is a positive integer. Prove
that the congruence

$$x^2 \equiv -1 \pmod{p}$$

has a solution in $\mathbf{Z}$.

10. Let K be a finite field with q elements. Prove that $x^q = x$ for all $x \in K$.
Hence the polynomials $t^q$ and $t$ give rise to the same function on K.

11. Let K be a finite field with q elements. If f, g are polynomials over K of
degrees $< q$, and if $f(x) = g(x)$ for all $x \in K$, prove that $f = g$ (as polynomials
in $K[t]$).

12. Let K be a finite field with q elements. Let f be a polynomial over K. Show
that there exists a polynomial $f^*$ over K of degree $< q$ such that

$$f^*(x) = f(x)$$

for all $x \in K$.

13. Let K be a finite field with q elements. Let $a \in K$. Show that there exists a
polynomial f over K such that $f(a) = 0$ and $f(x) = 1$ for $x \in K$, $x \neq a$. [Hint:
$(t - a)^{q-1}$.]

14. Let K be a finite field with q elements. Let $a \in K$. Show that there exists a
polynomial f over K such that $f(a) = 1$ and $f(x) = 0$ for all $x \in K$, $x \neq a$.

15. Let K be a finite field with q elements. Let $\varphi: K \to K$ be any function of K
into itself. Show that there exists a polynomial f over K such that
$\varphi(x) = f(x)$ for all $x \in K$.

[For a continuation of these ideas in connection with polynomials in
several variables, see Exercise 6 of §7.]


IV, §2. GREATEST COMMON DIVISOR 

Having the Euclidean algorithm, we may now develop the theory of 
divisibility exactly as for the integers, in Chapter I. 

Theorem 2.1. Let J be an ideal of $K[t]$. Then there exists a polynomial
g which is a generator of J. If J is not the zero ideal, and g is a
polynomial in J which is not 0, and is of smallest degree, then g is a
generator of J.

Proof. Suppose that J is not the zero ideal. Let g be a polynomial in
J which is not 0, and is of smallest degree. We assert that g is a
generator for J. Let f be any element of J. By the Euclidean algorithm, we
can find polynomials q, r such that

$$f = qg + r$$

with $\deg r < \deg g$. Then $r = f - qg$, and by the definition of an ideal, it
follows that r also lies in J. Since $\deg r < \deg g$, we must have $r = 0$.
Hence $f = qg$, and g is a generator for J, as desired.

Remark. Let $g_1$ be a non-zero generator for an ideal J, and let
$g_2$ also be a generator. Then there exists a polynomial q such that
$g_1 = q g_2$. Since

$$\deg g_1 = \deg q + \deg g_2,$$

it follows that $\deg g_2 \leq \deg g_1$. By symmetry, we must have

$$\deg g_2 = \deg g_1.$$

Hence q is constant. We can write

$$g_1 = c g_2$$

with some constant c. Write

$$g_2(t) = a_n t^n + \cdots + a_0$$

with $a_n \neq 0$. Take $b = a_n^{-1}$. Then $b g_2$ is also a generator of J, and its
leading coefficient is equal to 1. Thus we can always find a generator for
an ideal ($\neq 0$) whose leading coefficient is 1. It is furthermore clear that
this generator is uniquely determined.

Let f, g be non-zero polynomials. We shall say that g divides f, and
write $g \mid f$, if there exists a polynomial q such that $f = gq$. Let $f_1$, $f_2$ be
polynomials $\neq 0$. By a greatest common divisor of $f_1$, $f_2$ we shall mean
a polynomial g such that g divides $f_1$ and $f_2$, and furthermore, if h
divides $f_1$ and $f_2$, then h divides g.

Theorem 2.2. Let $f_1$, $f_2$ be non-zero polynomials in $K[t]$. Let g be a
generator for the ideal generated by $f_1$, $f_2$. Then g is a greatest
common divisor of $f_1$ and $f_2$.

Proof. Since $f_1$ lies in the ideal generated by $f_1$, $f_2$, there exists a
polynomial $q_1$ such that

$$f_1 = q_1 g,$$

whence g divides $f_1$. Similarly, g divides $f_2$. Let h be a polynomial
dividing both $f_1$ and $f_2$. Write

$$f_1 = h_1 h \quad\text{and}\quad f_2 = h_2 h$$

with some polynomials $h_1$ and $h_2$. Since g is in the ideal generated by
$f_1$, $f_2$, there are polynomials $g_1$, $g_2$ such that $g = g_1 f_1 + g_2 f_2$, whence

$$g = g_1 h_1 h + g_2 h_2 h = (g_1 h_1 + g_2 h_2) h.$$

Consequently h divides g, and our theorem is proved.

Remark 1. The greatest common divisor is determined up to a non-zero
constant multiple. If we select a greatest common divisor with leading
coefficient 1, then it is uniquely determined.

Remark 2. Exactly the same proof applies when we have more than
two polynomials. For instance, if $f_1, \ldots, f_n$ are non-zero polynomials,
and if g is a generator for the ideal generated by $f_1, \ldots, f_n$, then g is a
greatest common divisor of $f_1, \ldots, f_n$.

Polynomials $f_1, \ldots, f_n$ whose greatest common divisor is 1 are said to
be relatively prime.
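
Theorem 2.2 describes the greatest common divisor as a generator of an ideal but does not say how to compute it. One standard way, used here as an assumption and not taken from the text, is to repeat the division of Theorem 1.6: replacing the pair (f, g) by (g, r) with $f = qg + r$ does not change the ideal generated by the pair, and the process stops when the remainder is 0. The sketch below works over the rationals, with coefficient lists in increasing degree and hypothetical function names.

```python
from fractions import Fraction


def trim(f):
    """Drop trailing zero coefficients, so the zero polynomial becomes []."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f


def poly_mod(f, g):
    """Remainder r of f on division by a non-zero g, with deg r < deg g."""
    r = trim(f)
    while len(r) >= len(g):
        shift = len(r) - len(g)
        factor = r[-1] / g[-1]
        for i, b in enumerate(g):
            r[shift + i] -= factor * b   # cancel the leading term of r
        r = trim(r)
    return r


def monic_gcd(f, g):
    """Greatest common divisor with leading coefficient 1 (f, g not both zero)."""
    f = trim(Fraction(a) for a in f)
    g = trim(Fraction(b) for b in g)
    while g:
        f, g = g, poly_mod(f, g)
    lead = f[-1]
    return [a / lead for a in f]         # normalize as in Remark 1


# gcd(t^2 - 1, t^3 - 1) = t - 1
print(monic_gcd([-1, 0, 1], [-1, 0, 0, 1]) == [-1, 1])   # True
# t and t + 1 are relatively prime
print(monic_gcd([0, 1], [1, 1]) == [1])                  # True
```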


IV, §2. EXERCISES 

1. Show that $t^n - 1$ is divisible by $t - 1$.

2. Show that $t^4 + 4$ can be factored as a product of polynomials of degree 2
with integer coefficients. [Hint: try $t^2 \pm 2t + 2$.]

3. If n is odd, find the quotient of $t^n + 1$ by $t + 1$.


IV, §3. UNIQUE FACTORIZATION 

A polynomial p in $K[t]$ will be said to be irreducible (over K) if it is of
degree $\geq 1$, and if, given a factorization $p = fg$ with $f, g \in K[t]$, then
$\deg f$ or $\deg g = 0$ (i.e. one of f, g is constant). Thus, up to a non-zero
constant factor, the only divisors of p are p itself, and 1.

Example 1. The only irreducible polynomials over the complex
numbers are the polynomials of degree 1, i.e. non-zero constant multiples
of polynomials of type $t - \alpha$, with $\alpha \in \mathbf{C}$.

Example 2. The polynomial $t^2 + 1$ is irreducible over $\mathbf{R}$.

Theorem 3.1. Every polynomial in $K[t]$ of degree $\geq 1$ can be expressed
as a product $p_1 \cdots p_m$ of irreducible polynomials. In such a product, the
polynomials $p_1, \ldots, p_m$ are uniquely determined, up to a rearrangement,
and up to non-zero constant factors.

Proof. We first prove the existence of the factorization into a product
of irreducible polynomials. Let f be in $K[t]$, of degree $\geq 1$. If f is
irreducible, we are done. Otherwise, we can write

$$f = gh$$

where $\deg g < \deg f$ and $\deg h < \deg f$. By induction we can express g
and h as products of irreducible polynomials, and hence $f = gh$ can also
be expressed as such a product.

We must now prove uniqueness. We need a lemma.

Lemma 3.2. Let p be irreducible in $K[t]$. Let $f, g \in K[t]$ be non-zero
polynomials, and assume p divides fg. Then p divides f or p divides g.

Proof. Assume that p does not divide f. Then the greatest common
divisor of p and f is 1, and there exist polynomials $h_1$, $h_2$ in $K[t]$ such
that

$$1 = h_1 p + h_2 f.$$

(We use Theorem 2.2.) Multiplying by g yields

$$g = g h_1 p + h_2 f g.$$

But $fg = p h_3$ for some $h_3$, whence

$$g = (g h_1 + h_2 h_3) p,$$

and p divides g, as was to be shown.

The lemma will be applied when p divides a product of irreducible
polynomials $q_1 \cdots q_s$. In that case, p divides $q_1$ or p divides $q_2 \cdots q_s$.
Hence there exists a constant c such that $p = c q_1$, or p divides $q_2 \cdots q_s$.
In the latter case, we can proceed inductively, and we conclude that in
any case, there exists some i such that p and $q_i$ differ by a constant
factor.

Suppose now that we have two products of irreducible polynomials

$$p_1 \cdots p_r = q_1 \cdots q_s.$$

After renumbering the $q_i$, we may assume that $p_1 = c_1 q_1$ for some
constant $c_1$. Cancelling $q_1$, we obtain

$$c_1 p_2 \cdots p_r = q_2 \cdots q_s.$$

Repeating our argument inductively, we conclude that there exist
constants $c_i$ such that $p_i = c_i q_i$ for all i, after making a possible permutation
of $q_1, \ldots, q_s$. This proves the desired uniqueness.


Corollary 3.3. Let f be a polynomial in $K[t]$ of degree $\geq 1$. Then f
has a factorization $f = c p_1 \cdots p_s$, where $p_1, \ldots, p_s$ are irreducible
polynomials with leading coefficient 1, uniquely determined up to a permutation.


Corollary 3.4. Let K be algebraically closed. Let f be a polynomial in
$K[t]$ of degree $\geq 1$. Then f has a factorization

$$f(t) = c(t - \alpha_1) \cdots (t - \alpha_n),$$

with $\alpha_i \in K$ and $c \in K$. The factors $t - \alpha_i$ are uniquely determined up to
a permutation.


We shall deal mostly with polynomials having leading coefficient 1.
Let f be such a polynomial of degree $\geq 1$. Let $p_1, \ldots, p_r$ be the distinct
irreducible polynomials with leading coefficient 1 occurring in its
factorization. Then we can express f as a product

$$f = p_1^{m_1} \cdots p_r^{m_r},$$

where $m_1, \ldots, m_r$ are positive integers, uniquely determined by $p_1, \ldots, p_r$.
This factorization will be called a normalized factorization for f. In
particular, over an algebraically closed field, we can write

$$f(t) = (t - \alpha_1)^{m_1} \cdots (t - \alpha_r)^{m_r}$$

with distinct $\alpha_1, \ldots, \alpha_r$.

A polynomial with leading coefficient 1 is sometimes called monic.

If p is irreducible, and $f = p^m g$, where p does not divide g, and m is an
integer $\geq 0$, then we say that m is the multiplicity of p in f. (We define
$p^0$ to be 1.) We denote this multiplicity by $\mathrm{ord}_p f$, and also call it the
order of f at p.

If $\alpha$ is a root of f, and

$$f(t) = (t - \alpha)^m g(t),$$

with $g(\alpha) \neq 0$, then $t - \alpha$ does not divide $g(t)$, and m is the multiplicity of
$t - \alpha$ in f. We also say that m is the multiplicity of $\alpha$ in f. A root of f is
said to be simple if its multiplicity is 1. A root is said to be multiple if
$m > 1$.

There is an easy test for m > 1 in terms of the derivative. 

Let $f(t) = a_n t^n + \cdots + a_0$ be a polynomial. Define its (formal)
derivative to be

$$Df(t) = f'(t) = n a_n t^{n-1} + (n-1) a_{n-1} t^{n-2} + \cdots + a_1 = \sum_{k=1}^{n} k a_k t^{k-1}.$$

Then we have the following properties, just as in ordinary calculus.

Proposition 3.5. Let f, g be polynomials. Let $c \in K$. Then

$$(cf)' = cf',$$

$$(f + g)' = f' + g',$$

$$(fg)' = fg' + f'g.$$

Proof. The first two properties $(cf)' = cf'$ and $(f + g)' = f' + g'$ are
immediate from the definition. As to the third, namely the rule for
the derivative of a product, suppose we know this rule for the products
$f_1 g$ and $f_2 g$ where $f_1, f_2, g$ are polynomials. Then we deduce it for
$(f_1 + f_2)g$ as follows:

$$\begin{aligned}
((f_1 + f_2)g)' &= (f_1 g + f_2 g)' = (f_1 g)' + (f_2 g)' \\
&= f_1 g' + f_1' g + f_2 g' + f_2' g \\
&= (f_1 + f_2)g' + (f_1' + f_2')g.
\end{aligned}$$

Similarly, if we know the rule for the derivative of the products $f g_1$ and
$f g_2$, then this rule holds for the derivative of $f(g_1 + g_2)$. Therefore it
suffices to prove the rule for the derivative of a product when $f(t)$ and $g(t)$
are monomials, that is

$$f(t) = at^n \quad\text{and}\quad g(t) = bt^m$$

with $a, b \in K$. But then

$$\begin{aligned}
(at^n bt^m)' &= (abt^{n+m})' = (n + m)abt^{n+m-1} \\
&= nat^{n-1} bt^m + at^n mbt^{m-1} \\
&= f'(t)g(t) + f(t)g'(t).
\end{aligned}$$

This concludes the proof. 
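The product rule of Proposition 3.5 can be checked mechanically on a specific pair of polynomials. The following is only a sketch: it assumes that a polynomial is stored as a list of coefficients `[a_0, a_1, ..., a_n]`, and the helper names `derivative`, `add`, and `mul` are chosen here for illustration.

```python
def derivative(f):
    """Formal derivative of f = [a_0, a_1, ..., a_n] (low degree first)."""
    return [k * a for k, a in enumerate(f)][1:] or [0]

def add(f, g):
    """Coefficientwise sum, padding the shorter list with zeros."""
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return [a + b for a, b in zip(f, g)]

def mul(f, g):
    """Product of two polynomials by the usual convolution of coefficients."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

f = [1, 0, 3]       # 1 + 3t^2
g = [-2, 1, 0, 5]   # -2 + t + 5t^3

lhs = derivative(mul(f, g))                                # (fg)'
rhs = add(mul(f, derivative(g)), mul(derivative(f), g))    # fg' + f'g
print(lhs == rhs)
```

Such a check on one example is of course no substitute for the proof; it only illustrates the definition of the formal derivative.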

As an exercise, prove by induction:

If $f(t) = h(t)^m$ for some integer $m \geq 1$, then

$$f'(t) = m h(t)^{m-1} h'(t).$$
Theorem 3.6. Let $K$ be a field. Let $f$ be a polynomial over $K$, of
degree $\geq 1$, and let $\alpha$ be a root of $f$ in $K$. Then the multiplicity of $\alpha$ in $f$
is $> 1$ if and only if $f'(\alpha) = 0$.

Proof. Suppose that

$$f(t) = (t - \alpha)^m g(t)$$

with $m > 1$. Taking the derivative, we find

$$f'(t) = m(t - \alpha)^{m-1} g(t) + (t - \alpha)^m g'(t).$$

Substituting $\alpha$ shows that $f'(\alpha) = 0$ because $m - 1 \geq 1$. Conversely,
suppose

$$f(t) = (t - \alpha)^m g(t),$$

and $g(\alpha) \neq 0$, so that $m$ is the multiplicity of $\alpha$ in $f$. If $m = 1$ then

$$f'(t) = g(t) + (t - \alpha)g'(t),$$

so that $f'(\alpha) = g(\alpha) \neq 0$. This proves our theorem.
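As an illustration of Theorem 3.6, the sketch below tests whether a given root is multiple by evaluating the formal derivative at that root. The coefficient-list representation and the names `evaluate`, `derivative`, and `is_multiple_root` are assumptions made for this example only.

```python
def evaluate(f, x):
    """Horner evaluation of f = [a_0, ..., a_n] at x."""
    acc = 0
    for a in reversed(f):
        acc = acc * x + a
    return acc

def derivative(f):
    """Formal derivative of f = [a_0, ..., a_n]."""
    return [k * a for k, a in enumerate(f)][1:] or [0]

def is_multiple_root(f, alpha):
    """For a root alpha of f, return True iff f'(alpha) = 0 (Theorem 3.6)."""
    if evaluate(f, alpha) != 0:
        raise ValueError("alpha is not a root of f")
    return evaluate(derivative(f), alpha) == 0

# f(t) = (t - 1)^2 (t + 2) = t^3 - 3t + 2
f = [2, -3, 0, 1]
print(is_multiple_root(f, 1))    # the root 1 has multiplicity 2
print(is_multiple_root(f, -2))   # the root -2 is simple
```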

Example. Let $K$ be a field in which the polynomial $t^n - 1$ factorizes
into factors of degree 1, that is

$$t^n - 1 = \prod_{i=1}^{n} (t - \zeta_i).$$

The roots of $t^n - 1$ constitute the group of $n$-th roots of unity $\mu_n$.
Suppose that the characteristic of $K$ is 0, or is $p$ and $p \nmid n$. Then we
claim that these $n$ roots $\zeta_1,\ldots,\zeta_n$ are distinct, so the group $\mu_n$ has order $n$.
Indeed, let $f(t) = t^n - 1$. Then

$$f'(t) = nt^{n-1},$$

and if $\zeta \in \mu_n$ then $f'(\zeta) = n\zeta^{n-1} \neq 0$ because $n \neq 0$ in $K$. This proves that
every root occurs with multiplicity 1, whence that the $n$ roots are distinct.
By Theorem 1.10 we know that $\mu_n$ is cyclic, and we have now found that
$\mu_n$ is cyclic of order $n$ when $p \nmid n$. The equation $t^n - 1 = 0$ is called the
cyclotomic equation. We shall study it especially in Exercise 13, Chapter
VII, §6 and Chapter VIII, §5.
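The role of the hypothesis $p \nmid n$ can be seen numerically in the field $\mathbf{Z}/7\mathbf{Z}$. The sketch below lists by brute force the roots of $t^n - 1$ in $\mathbf{Z}/p\mathbf{Z}$; the function name `roots_of_unity` is hypothetical, and the choice $p = 7$ is only for illustration.

```python
def roots_of_unity(n, p):
    """Distinct roots of t^n - 1 in Z/pZ, found by trying every residue."""
    return [x for x in range(p) if pow(x, n, p) == 1]

# p = 7 does not divide n = 6: the six roots are distinct.
print(roots_of_unity(6, 7))

# p = 7 divides n = 7: here t^7 - 1 = (t - 1)^7, so 1 is the only root.
print(roots_of_unity(7, 7))
```

In the second case the derivative $7t^6$ is identically zero in characteristic 7, in agreement with Theorem 3.6.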


IV, §3. EXERCISES 

1. Let $f$ be a polynomial of degree 2 over a field $K$. Show that either $f$ is
irreducible over $K$, or $f$ has a factorization into linear factors over $K$.

2. (a) Let $f$ be a polynomial of degree 3 over a field $K$. If $f$ is not irreducible
over $K$, show that $f$ has a root in $K$.

(b) Let $F = \mathbf{Z}/2\mathbf{Z}$. Show that the polynomial $t^3 + t^2 + 1$ is irreducible in
$F[t]$.

3. Let $f(t)$ be an irreducible polynomial with leading coefficient 1 over the real
numbers. Assume $\deg f = 2$. Show that $f(t)$ can be written in the form

$$f(t) = (t - a)^2 + b^2$$

with some $a, b \in \mathbf{R}$ and $b \neq 0$. Conversely, prove that any such polynomial is
irreducible over $\mathbf{R}$.

4. Let $\sigma: K \to L$ be an isomorphism of fields. If $f(t) = \sum a_i t^i$ is a polynomial
over $K$, define $\sigma f$ to be the polynomial $\sum \sigma(a_i) t^i$ in $L[t]$.

(a) Prove that the association $f \mapsto \sigma f$ is an isomorphism of $K[t]$ onto $L[t]$.

(b) Let $\alpha \in K$ be a root of $f$ of multiplicity $m$. Prove that $\sigma\alpha$ is a root of $\sigma f$
also of multiplicity $m$. [Hint: Use unique factorization.]

Example for Exercise 4. Let $K = \mathbf{C}$ be the complex numbers, and let $\sigma$
be conjugation, so $\sigma: \mathbf{C} \to \mathbf{C}$ maps $\alpha \mapsto \bar{\alpha}$. If $f$ is a polynomial with
complex coefficients, say

$$f(t) = \alpha_n t^n + \cdots + \alpha_0,$$

then its complex conjugate

$$\bar{f}(t) = \bar{\alpha}_n t^n + \cdots + \bar{\alpha}_0$$

is obtained by taking the complex conjugate of each coefficient. If $f, g$ are
in $\mathbf{C}[t]$, then

$$\overline{(f + g)} = \bar{f} + \bar{g}, \qquad \overline{(fg)} = \bar{f}\bar{g},$$

and if $\beta \in \mathbf{C}$, then $\overline{(\beta f)} = \bar{\beta}\bar{f}$.

5. (a) Let $f(t)$ be a polynomial with real coefficients. Let $\alpha$ be a root of $f$ which
is complex but not real. Show that $\bar{\alpha}$ is also a root of $f$.

(b) Assume that the complex numbers are algebraically closed. Let $f(t) \in \mathbf{R}[t]$
be a polynomial with real coefficients, of degree $\geq 1$. Assume that $f$ is
irreducible. Prove that $\deg f = 1$ or $\deg f = 2$. It follows that if a real
polynomial is expressed as a product of irreducible factors, then these
factors have degree 1 or 2.

6. Let $K$ be a field which is a subfield of a field $E$. Let $f(t) \in K[t]$ be an
irreducible polynomial, let $g(t) \in K[t]$ be any polynomial $\neq 0$, and let $\alpha \in E$ be
an element such that $f(\alpha) = g(\alpha) = 0$. In other words, $f, g$ have a common
root in $E$. Prove that $f(t) \mid g(t)$ in $K[t]$.

7. (a) Let $K$ be a field of characteristic 0, subfield of a field $E$. Let $\alpha \in E$ be a
root of $f(t) \in K[t]$. Prove that $\alpha$ has multiplicity $m$ if and only if

$$f^{(k)}(\alpha) = 0 \text{ for } k = 1,\ldots,m - 1 \text{ but } f^{(m)}(\alpha) \neq 0.$$

(As usual, $f^{(k)}$ denotes the $k$-th derivative of $f$.)

(b) Show that the assertion of (a) is not generally true if $K$ has characteristic $p$.

(c) In fact, what is the value of $f^{(m)}(\alpha)$? There is a very simple expression for
it.

(d) If $K$ has characteristic 0 and if $f(t)$ is irreducible in $K[t]$, prove that $\alpha$ has
multiplicity 1.

(e) Suppose $K$ has characteristic $p$ and $p \nmid n$. Let $f(t)$ be irreducible in $K[t]$,
of degree $n$. Let $\alpha$ be a root of $f$. Prove that $\alpha$ has multiplicity 1.

8. Show that the following polynomials have no multiple roots in $\mathbf{C}$:

(a) $t^4 + t$ (b) $t^5 - 5t + 1$

(c) any polynomial $t^2 + bt + c$ if $b, c$ are numbers such that $b^2 - 4c$ is not 0.

9. (a) Let $K$ be a subfield of a field $E$, and $\alpha \in E$. Let $J$ be the set of all
polynomials $f(t)$ in $K[t]$ such that $f(\alpha) = 0$. Show that $J$ is an ideal. If $J$
is not the zero ideal, show that the monic generator of $J$ is irreducible.

(b) Conversely, let $p(t)$ be irreducible in $K[t]$ and let $\alpha$ be a root. Show that
the ideal of polynomials $f(t)$ in $K[t]$ such that $f(\alpha) = 0$ is the ideal
generated by $p(t)$.

10. Let $f, g$ be two polynomials, written in the form

$$f = p_1^{i_1} \cdots p_r^{i_r}$$

and

$$g = p_1^{j_1} \cdots p_r^{j_r},$$

where $i_\nu, j_\nu$ are integers $\geq 0$, and $p_1,\ldots,p_r$ are distinct irreducible
polynomials.

(a) Show that the greatest common divisor of $f$ and $g$ can be expressed as a
product $p_1^{k_1} \cdots p_r^{k_r}$ where $k_1,\ldots,k_r$ are integers $\geq 0$. Express $k_\nu$ in terms of
$i_\nu$ and $j_\nu$.

(b) Define the least common multiple of polynomials, and express the least
common multiple of $f$ and $g$ as a product $p_1^{k_1} \cdots p_r^{k_r}$ with integers $k_\nu \geq 0$.
Express $k_\nu$ in terms of $i_\nu$ and $j_\nu$.

11. Give the greatest common divisor and least common multiple of the following
pairs of polynomials with complex coefficients:

(a) $(t - 2)^3 (t - 3)^4 (t - i)$ and $(t - 1)(t - 2)(t - 3)^3$

(b) $(t^2 + 1)(t^2 - 1)$ and $(t + i)^3 (t^3 - 1)$

12. Let $K$ be a field, $R = K[t]$ the ring of polynomials, and $F$ the quotient field
of $R$, i.e. the field of rational functions. Let $\alpha \in K$. Let $R_\alpha$ be the set of
rational functions which can be written as a quotient $f/g$ of polynomials
such that $g(\alpha) \neq 0$. Show that $R_\alpha$ is a ring. If $\varphi$ is a rational function, and
$\varphi = f/g$ such that $g(\alpha) \neq 0$, define $\varphi(\alpha) = f(\alpha)/g(\alpha)$. Show that this value $\varphi(\alpha)$
is independent of the choice of representation of $\varphi$ as a quotient $f/g$. Show
that the map $\varphi \mapsto \varphi(\alpha)$ is a ring-homomorphism of $R_\alpha$ into $K$. Show that the
kernel of this ring-homomorphism consists of all rational functions $f/g$ such
that $g(\alpha) \neq 0$ and $f(\alpha) = 0$. If $M_\alpha$ denotes this kernel, show that $M_\alpha$ is a
maximal ideal of $R_\alpha$.
13. Let $W_n$ be the set of primitive $n$-th roots of unity in $\mathbf{C}^*$. Define the $n$-th
cyclotomic polynomial to be

$$\Phi_n(t) = \prod_{\zeta \in W_n} (t - \zeta).$$

(a) Prove that $t^n - 1 = \prod_{d \mid n} \Phi_d(t)$.

(b) $\Phi_n(t) = \prod_{d \mid n} (t^{n/d} - 1)^{\mu(d)}$ where $\mu$ is the Moebius function.

(c) Let $p$ be a prime number and let $k$ be a positive integer. Prove

$$\Phi_{p^k}(t) = \Phi_p(t^{p^{k-1}}) \quad\text{and}\quad \Phi_p(t) = t^{p-1} + \cdots + 1.$$

(d) Compute explicitly $\Phi_n(t)$ for $n \leq 10$.

14. Let $R$ be a rational function over the field $K$, and express $R$ as a quotient of
polynomials, $R = g/f$. Define the derivative

$$R' = \frac{fg' - gf'}{f^2},$$

where the prime means the formal derivative of polynomials as in the text.

(a) Show that this derivative is independent of the expression of $R$ as a
quotient of polynomials, i.e. if $R = g_1/f_1$ then

$$\frac{fg' - gf'}{f^2} = \frac{f_1 g_1' - g_1 f_1'}{f_1^2}.$$

(b) Show that the derivative of rational functions satisfies the same rules as
before, namely for rational functions $R_1$ and $R_2$ we have

$$(R_1 + R_2)' = R_1' + R_2' \quad\text{and}\quad (R_1 R_2)' = R_1 R_2' + R_1' R_2.$$

(c) Let $\alpha_1,\ldots,\alpha_n$ and $a_1,\ldots,a_n$ be elements of $K$ such that

$$\frac{1}{(t - \alpha_1) \cdots (t - \alpha_n)} = \frac{a_1}{t - \alpha_1} + \cdots + \frac{a_n}{t - \alpha_n}.$$

Let $f(t) = (t - \alpha_1) \cdots (t - \alpha_n)$ and assume that $\alpha_1,\ldots,\alpha_n$ are distinct. Show
that

$$a_1 = \frac{1}{(\alpha_1 - \alpha_2) \cdots (\alpha_1 - \alpha_n)} = \frac{1}{f'(\alpha_1)}.$$

15. Show that the map $R \mapsto R'/R$ is a homomorphism from the multiplicative
group of non-zero rational functions to the additive group of rational
functions. We call $R'/R$ the logarithmic derivative of $R$. If

$$R(t) = \prod_{i=1}^{n} (t - \alpha_i)^{m_i},$$

where $m_1,\ldots,m_n$ are integers, what is $R'/R$?
16. For any polynomial $f$ let $n_0(f)$ be the number of distinct roots of $f$
when $f$ is factored into factors of degree 1. So if

$$f(t) = c \prod_{i=1}^{r} (t - \alpha_i)^{m_i},$$

where $c \neq 0$ and $\alpha_1,\ldots,\alpha_r$ are distinct, then $n_0(f) = r$. Prove:

Mason-Stothers Theorem. Let $K = \mathbf{C}$, or more generally an algebraically
closed field of characteristic 0. Let $f, g, h \in K[t]$ be relatively prime polynomials
not all constant such that $f + g = h$. Then

$$\deg f, \ \deg g, \ \deg h \leq n_0(fgh) - 1.$$

[Mason's proof. You may assume that $f, g, h$ can be factored into products of
polynomials of degree 1 in some larger field. Divide the relation $f + g = h$ by
$h$, so get $R + S = 1$ where $R, S$ are rational functions. Take the derivative and
write the resulting relation as

$$\frac{R'}{R} R + \frac{S'}{S} S = 0.$$

Solve for $S/R$ in terms of the logarithmic derivatives. Let

$$f(t) = c_1 \prod (t - \alpha_i)^{m_i}, \quad g(t) = c_2 \prod (t - \beta_j)^{n_j}, \quad\text{and}\quad h(t) = c_3 \prod (t - \gamma_k)^{q_k}.$$

Let $D = \prod (t - \alpha_i) \prod (t - \beta_j) \prod (t - \gamma_k)$. Use $D$ as a common denominator
and multiply $R'/R$ and $S'/S$ by $D$. Then count degrees.]

17. Assume the preceding exercise. Let $f, g \in K[t]$ be non-constant polynomials
such that $f^3 - g^2 \neq 0$, and let $h = f^3 - g^2$. Prove that

$$\deg f \leq 2 \deg h - 2 \quad\text{and}\quad \deg g \leq 3 \deg h - 3.$$


18. More generally, suppose $f^m + g^n = h \neq 0$. Assume $mn > m + n$. Prove that

$$\deg f \leq \frac{n}{mn - (m + n)} \deg h.$$

19. Let $f, g, h$ be polynomials over a field of characteristic 0, such that

$$f^n + g^n = h^n$$

and assume $f, g$ relatively prime. Suppose that $f, g$ have degree $\geq 1$. Show
that $n \leq 2$. (This is the Fermat problem for polynomials. Use Exercise 16.)

For an alternate treatment of the above, see §9. 
IV, §4. PARTIAL FRACTIONS

In the preceding section, we proved that a polynomial can be expressed
as a product of powers of irreducible polynomials in a unique way (up
to a permutation of the factors). The same is true of a rational function,
if we allow negative exponents. Let $R = g/f$ be a rational function,
expressed as a quotient of polynomials $g, f$ with $f \neq 0$. Suppose $R \neq 0$.
If $g, f$ are not relatively prime, we may cancel their greatest common
divisor and thus obtain an expression of $R$ as a quotient of relatively
prime polynomials. Factoring out their constant leading coefficients, we
can write

$$R = c\,\frac{g_1}{f_1},$$

where $f_1, g_1$ have leading coefficient 1. Then $f_1$, $g_1$, and $c$ are uniquely
determined, for suppose

$$c g_1/f_1 = c_2 g_2/f_2$$

for constants $c, c_2$ and pairs of relatively prime polynomials $f_1, g_1$ and
$f_2, g_2$ with leading coefficient 1. Then

$$c g_1 f_2 = c_2 g_2 f_1.$$

From the unique factorization of polynomials, we conclude that $g_1 = g_2$
and $f_1 = f_2$ so that $c = c_2$.

If we now factorize $f_1$ and $g_1$ into products of powers of irreducible
polynomials, we obtain the unique factorization of $R$. This is entirely
analogous to the factorization of a rational number obtained in
Chapter I, §4.

We wish to decompose a rational function into a sum of rational 
functions, such that the denominator of each term is equal to a power of 
an irreducible polynomial. Such a decomposition is called a partial frac- 
tion decomposition. We begin by a lemma which allows us to apply 
induction. 


Lemma 4.1. Let $f_1, f_2$ be non-zero, relatively prime polynomials over a
field $K$. Then there exist polynomials $h_1, h_2$ over $K$ such that

$$\frac{1}{f_1 f_2} = \frac{h_1}{f_1} + \frac{h_2}{f_2}.$$

Proof. Since $f_1, f_2$ are relatively prime, there exist polynomials $h_1, h_2$
such that

$$h_2 f_1 + h_1 f_2 = 1.$$

Dividing both sides by $f_1 f_2$, we obtain what we want.
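As a small worked instance of the lemma, take $f_1 = t$ and $f_2 = t^2 + 1$ over the rational numbers. This pair is chosen here only for illustration and does not come from the text. The two polynomials are relatively prime, and the relation $h_2 f_1 + h_1 f_2 = 1$ can be found by inspection:

```latex
\[
  (-t)\cdot t \;+\; 1\cdot(t^2 + 1) \;=\; 1,
  \qquad\text{so } h_2 = -t,\quad h_1 = 1 .
\]
Dividing both sides by $f_1 f_2 = t(t^2+1)$ gives
\[
  \frac{1}{t(t^2+1)} \;=\; \frac{1}{t} \;-\; \frac{t}{t^2+1}.
\]
```

One can confirm the result by putting the right-hand side over the common denominator $t(t^2 + 1)$: the numerator is $(t^2 + 1) - t^2 = 1$.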

Theorem 4.2. Every rational function $R$ can be written in the form

$$R = \frac{h_1}{p_1^{i_1}} + \cdots + \frac{h_n}{p_n^{i_n}} + h,$$

where $p_1,\ldots,p_n$ are distinct irreducible polynomials with leading
coefficient 1; $i_1,\ldots,i_n$ are integers $\geq 0$; $h_1,\ldots,h_n, h$ are polynomials, satisfying

$$\deg h_\nu < \deg p_\nu^{i_\nu} \quad\text{and}\quad p_\nu \nmid h_\nu$$

for $\nu = 1,\ldots,n$. In such an expression, the integers $i_\nu$ and the
polynomials $h_\nu$, $h$ ($\nu = 1,\ldots,n$) are uniquely determined when all $i_\nu > 0$.

Proof. We first prove the existence of the expression described in our
theorem. Let $R = g/f$ where $f$ is a non-zero polynomial, with $g, f$
relatively prime, and write

$$f = p_1^{i_1} \cdots p_n^{i_n},$$

where $p_1,\ldots,p_n$ are distinct irreducible polynomials, and $i_1,\ldots,i_n$ are
integers $\geq 0$. By the lemma, there exist polynomials $g_1, g_1^*$ such that

$$\frac{1}{f} = \frac{g_1}{p_1^{i_1}} + \frac{g_1^*}{p_2^{i_2} \cdots p_n^{i_n}},$$

and by induction, there exist polynomials $g_2,\ldots,g_n$ such that

$$\frac{g_1^*}{p_2^{i_2} \cdots p_n^{i_n}} = \frac{g_2}{p_2^{i_2}} + \cdots + \frac{g_n}{p_n^{i_n}}.$$

Multiplying by $g$, we obtain

$$\frac{g}{f} = \frac{g g_1}{p_1^{i_1}} + \cdots + \frac{g g_n}{p_n^{i_n}}.$$

By the Euclidean algorithm, we can divide $g g_\nu$ by $p_\nu^{i_\nu}$ for $\nu = 1,\ldots,n$,
letting

$$g g_\nu = q_\nu p_\nu^{i_\nu} + h_\nu, \qquad \deg h_\nu < \deg p_\nu^{i_\nu}.$$

In this way we obtain the desired expression for $g/f$, with $h = q_1 + \cdots + q_n$.
Next we prove the uniqueness. Suppose we have expressions

$$\frac{h_1}{p_1^{i_1}} + \cdots + \frac{h_n}{p_n^{i_n}} + h = \frac{\bar{h}_1}{p_1^{j_1}} + \cdots + \frac{\bar{h}_n}{p_n^{j_n}} + \bar{h},$$

satisfying the conditions stated in the theorem. (We can assume that the
irreducible polynomials $p_1,\ldots,p_n$ are the same on both sides, letting some
$i_\nu$ be equal to 0 if necessary.) Then there exist polynomials $\varphi, \psi$ such
that $\psi \neq 0$ and $p_1 \nmid \psi$, for which we can write

$$\frac{h_1}{p_1^{i_1}} - \frac{\bar{h}_1}{p_1^{j_1}} = \frac{\varphi}{\psi}.$$

Say $i_1 \leq j_1$. Then

$$\frac{h_1 p_1^{j_1 - i_1} - \bar{h}_1}{p_1^{j_1}} = \frac{\varphi}{\psi}.$$

Since $\psi$ is not divisible by $p_1$, it follows from unique factorization that
$p_1^{j_1}$ divides $h_1 p_1^{j_1 - i_1} - \bar{h}_1$. If $j_1 \neq i_1$ then $p_1 \mid \bar{h}_1$, contrary to the conditions
stated in the theorem. Hence $j_1 = i_1$. Again since $\psi$ is not divisible by
$p_1$, it follows now that $p_1^{i_1}$ divides $h_1 - \bar{h}_1$. By hypothesis,

$$\deg(h_1 - \bar{h}_1) < \deg p_1^{i_1}.$$

Hence $h_1 - \bar{h}_1 = 0$, whence $h_1 = \bar{h}_1$. We therefore conclude that

$$\frac{h_2}{p_2^{i_2}} + \cdots + \frac{h_n}{p_n^{i_n}} + h = \frac{\bar{h}_2}{p_2^{j_2}} + \cdots + \frac{\bar{h}_n}{p_n^{j_n}} + \bar{h},$$

and we can conclude the proof by induction.


The expression of Theorem 4.2 is called the partial fraction decomposi- 
tion of R. 

The irreducible polynomials $p_1,\ldots,p_n$ in Theorem 4.2 can be described
somewhat more precisely, and the next theorem gives additional
information on them, and also on $h$.

Theorem 4.3. Let the notation be as in Theorem 4.2, and let the
rational function $R$ be expressed in the form $R = g/f$ where $g, f$ are
relatively prime polynomials, $f \neq 0$. Assume that all integers $i_1,\ldots,i_n$
are $> 0$. Then

$$f = p_1^{i_1} \cdots p_n^{i_n}$$

is the prime power factorization of $f$. Furthermore, if $\deg f > \deg g$,
then $h = 0$.
Proof. If we put the partial fraction expression for $R$ in Theorem 4.2
over a common denominator, we obtain

$$(*) \qquad R = \frac{h_1 p_2^{i_2} \cdots p_n^{i_n} + \cdots + h_n p_1^{i_1} \cdots p_{n-1}^{i_{n-1}} + h p_1^{i_1} \cdots p_n^{i_n}}{p_1^{i_1} \cdots p_n^{i_n}}.$$

Then $p_\nu$ does not divide the numerator on the right in (*), for any index
$\nu = 1,\ldots,n$. Indeed, $p_\nu$ divides every term in this numerator except the
term

$$h_\nu p_1^{i_1} \cdots \widehat{p_\nu^{i_\nu}} \cdots p_n^{i_n}$$

(where the roof over $p_\nu^{i_\nu}$ means that we omit this factor). This comes
from the hypothesis that $p_\nu$ does not divide $h_\nu$. Hence the numerator
and denominator on the right in (*) are relatively prime, thereby proving
our first assertion.

As to the second, letting $g$ be the numerator of $R$ and $f$ its denominator,
we have $f = p_1^{i_1} \cdots p_n^{i_n}$ and

$$g = Rf = h_1 p_2^{i_2} \cdots p_n^{i_n} + \cdots + h_n p_1^{i_1} \cdots p_{n-1}^{i_{n-1}} + h p_1^{i_1} \cdots p_n^{i_n}.$$

Assume that $\deg g < \deg f$. Then every term in the preceding sum has
degree $< \deg f$, except possibly the last term

$$hf = h p_1^{i_1} \cdots p_n^{i_n}.$$

If $h \neq 0$, then this last term has degree $\geq \deg f$, and we then get

$$hf = g - h_1 p_2^{i_2} \cdots p_n^{i_n} - \cdots - h_n p_1^{i_1} \cdots p_{n-1}^{i_{n-1}},$$

where the left-hand side has degree $\geq \deg f$ and the right-hand side has
degree $< \deg f$. This is impossible. Hence $h = 0$, as was to be shown.


Remark. Given a rational function $R = g/f$ where $g, f$ are relatively
prime polynomials, we can use the Euclidean algorithm and write

$$g = g_1 f + g_2,$$

where $g_1, g_2$ are polynomials and $\deg g_2 < \deg f$. Then

$$\frac{g}{f} = g_1 + \frac{g_2}{f},$$

and we can apply Theorem 4.3 to the rational function $g_2/f$. In studying
rational functions, it is always useful to perform first this long division to
reduce the study of the rational function to the case when the degree of
the numerator is smaller than the degree of its denominator.

Example 1. Let $\alpha_1,\ldots,\alpha_n$ be distinct elements of $K$. Then there exist
elements $a_1,\ldots,a_n \in K$ such that

$$\frac{1}{(t - \alpha_1) \cdots (t - \alpha_n)} = \frac{a_1}{t - \alpha_1} + \cdots + \frac{a_n}{t - \alpha_n}.$$

Indeed, in the present case we can apply Theorems 4.2 and 4.3, with
$g = 1$, and hence $\deg g < \deg f$. In Exercise 14 of the preceding section,
we showed how to determine $a_i$ in a special way.

Each expression $h_\nu/p_\nu^{i_\nu}$ in the partial fraction decomposition can be
further analyzed, by writing $h_\nu$ in a special way, which we now describe.

Theorem 4.4. Let $\varphi$ be a non-constant polynomial over the field $K$. Let
$h$ be any polynomial over $K$. Then there exist polynomials $\psi_0,\ldots,\psi_m$
such that

$$h = \psi_0 + \psi_1 \varphi + \cdots + \psi_m \varphi^m$$

and $\deg \psi_i < \deg \varphi$ for all $i = 0,\ldots,m$. The polynomials $\psi_0,\ldots,\psi_m$ are
uniquely determined by these conditions.

Proof. We prove the existence of $\psi_0,\ldots,\psi_m$ by induction on the
degree of $h$. By the Euclidean algorithm, we can write

$$h = q\varphi + \psi_0$$

with polynomials $q, \psi_0$, and $\deg \psi_0 < \deg \varphi$. Then $\deg q < \deg h$, so
that by induction we can write

$$q = \psi_1 + \psi_2 \varphi + \cdots + \psi_m \varphi^{m-1}$$

with polynomials $\psi_i$ such that $\deg \psi_i < \deg \varphi$. Substituting, we obtain

$$h = (\psi_1 + \psi_2 \varphi + \cdots + \psi_m \varphi^{m-1})\varphi + \psi_0,$$

which yields the desired expression.

As for uniqueness, we observe first that in the expression given in the
theorem, namely

$$h = \psi_0 + \psi_1 \varphi + \cdots + \psi_m \varphi^m = \psi_0 + \varphi(\psi_1 + \cdots + \psi_m \varphi^{m-1}),$$

the polynomial $\psi_0$ is necessarily the remainder of the division of $h$ by $\varphi$,
so that its uniqueness is given by the Euclidean algorithm. Then, writing
$h = q\varphi + \psi_0$, we conclude that

$$q = \psi_1 + \cdots + \psi_m \varphi^{m-1},$$

and $q$ is uniquely determined. Thus $\psi_1,\ldots,\psi_m$ are uniquely determined
by induction, as was to be proved.
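The existence proof is effectively an algorithm: divide by $\varphi$, record the remainder, and repeat with the quotient. The sketch below follows these steps over the rational numbers. It assumes polynomials are stored as coefficient lists `[a_0, ..., a_n]`, that $\varphi$ is non-constant, and the names `trim`, `divmod_poly`, and `adic_expansion` are hypothetical helpers introduced for this illustration.

```python
from fractions import Fraction

def trim(f):
    """Drop trailing zero coefficients (lists are low degree first)."""
    f = list(f)
    while len(f) > 1 and f[-1] == 0:
        f.pop()
    return f

def divmod_poly(h, phi):
    """Euclidean division: return (q, r) with h = q*phi + r and deg r < deg phi."""
    r = trim([Fraction(a) for a in h])
    phi = trim(phi)
    q = [Fraction(0)] * max(len(r) - len(phi) + 1, 1)
    while len(r) >= len(phi) and any(r):
        shift = len(r) - len(phi)
        c = r[-1] / phi[-1]            # cancel the leading term of r
        q[shift] = c
        for i, b in enumerate(phi):
            r[shift + i] -= c * b
        r = trim(r)
    return trim(q), r

def adic_expansion(h, phi):
    """Return [psi_0, psi_1, ...] with h = sum of psi_i * phi^i, deg psi_i < deg phi."""
    digits = []
    q = trim(h)
    while any(q):
        q, r = divmod_poly(q, phi)     # h = q*phi + psi_0, then recurse on q
        digits.append(r)
    return digits or [[0]]

# t^3 + 1 in powers of (t - 1): 2 + 3(t-1) + 3(t-1)^2 + (t-1)^3
print(adic_expansion([1, 0, 0, 1], [-1, 1]))
```

For instance, with $\varphi = t^2 + 1$ the same function expands $t^4$ as $1 - 2(t^2 + 1) + (t^2 + 1)^2$.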

The expression of $h$ in terms of powers of $\varphi$ as given in Theorem 4.4
is called its $\varphi$-adic expansion. We can apply this to the case where $\varphi$ is
an irreducible polynomial $p$, in which case this expression is the $p$-adic
expansion of $h$. Suppose that

$$h = \psi_0 + \psi_1 p + \cdots + \psi_m p^m$$

is its $p$-adic expansion. Then dividing by $p^i$ for some integer $i > 0$, we
obtain the following theorem.

Theorem 4.5. Let $h$ be a polynomial and $p$ an irreducible polynomial
over the field $K$. Let $i$ be an integer $> 0$. Then there exists a unique
expression

$$\frac{h}{p^i} = \frac{g_{-i}}{p^i} + \frac{g_{-i+1}}{p^{i-1}} + \cdots + g_0 + g_1 p + \cdots + g_s p^s,$$

where $g_\mu$ are polynomials of degree $< \deg p$.

In Theorem 4.5, we have adjusted the numbering of $g_{-i}, g_{-i+1}, \ldots$ so
that it would fit the exponent of $p$ occurring in the denominator.
Otherwise, except for this numbering, these polynomials $g$ are nothing
but $\psi_0, \psi_1, \ldots$ found in the $p$-adic expansion of $h$.


Corollary 4.6. Let $\alpha \in K$, and let $h$ be a polynomial over $K$. Then

$$\frac{h(t)}{(t - \alpha)^i} = \frac{a_{-i}}{(t - \alpha)^i} + \frac{a_{-i+1}}{(t - \alpha)^{i-1}} + \cdots + a_0 + a_1(t - \alpha) + \cdots,$$

where $a_\mu$ are elements of $K$, uniquely determined.

Proof. In this case, $p(t) = t - \alpha$ has degree 1, so that the coefficients
in the $p$-adic expansion must be constants.
Example 2. To determine the partial fraction decomposition of a given
rational function, one can solve a system of linear equations. We give an
example. We wish to write

$$\frac{1}{(t - 1)(t - 2)} = \frac{a}{t - 1} + \frac{b}{t - 2}$$

with constants $a$ and $b$. Putting the right-hand side over a common
denominator, we have

$$\frac{1}{(t - 1)(t - 2)} = \frac{a(t - 2) + b(t - 1)}{(t - 1)(t - 2)}.$$

Setting the numerators equal to each other, we must have

$$a + b = 0,$$

$$-2a - b = 1.$$

We then solve for $a$ and $b$ to get $a = -1$ and $b = 1$. The general case
can be handled similarly.
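The linear system of Example 2 can be set up and solved mechanically for any product of distinct linear factors. The following sketch does this over the rational numbers with a small Gauss-Jordan elimination. The function names `mul`, `solve`, and `partial_fractions` are assumptions made for this illustration, and the solver assumes that the system has a unique solution, as Theorem 4.3 guarantees here.

```python
from fractions import Fraction

def mul(f, g):
    """Product of polynomials given as coefficient lists (low degree first)."""
    out = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def solve(A, b):
    """Gauss-Jordan elimination over Q for a square system with a unique solution."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(y)] for row, y in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [row[-1] for row in M]

def partial_fractions(alphas):
    """Constants a_i with 1/prod(t - alpha_i) = sum of a_i/(t - alpha_i); alphas distinct."""
    n = len(alphas)
    columns = []
    for i in range(n):
        cofactor = [Fraction(1)]
        for j, alpha in enumerate(alphas):
            if j != i:
                cofactor = mul(cofactor, [-alpha, 1])
        columns.append(cofactor)        # coefficients of prod over j != i of (t - alpha_j)
    # Row k expresses the coefficient of t^k in the combined numerator.
    A = [[columns[i][k] for i in range(n)] for k in range(n)]
    b = [1] + [0] * (n - 1)             # the numerator must be the constant 1
    return solve(A, b)

print(partial_fractions([1, 2]))        # the constants a, b of Example 2
```

For the roots 1 and 2 the system built by `partial_fractions` is exactly $-2a - b = 1$, $a + b = 0$.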


IV, §4. EXERCISES 


1. Determine the partial fraction decomposition of the following rational
functions.

(a) $\dfrac{t + 1}{(t - 1)(t + 2)}$ (b) $\dfrac{1}{(t + 1)(t + 2)}$

2. Let $R = g/f$ be a rational function with $\deg g < \deg f$. Let

$$\frac{g}{f} = \frac{h_1}{p_1^{i_1}} + \cdots + \frac{h_n}{p_n^{i_n}}$$

be its partial fraction decomposition. Let $d_\nu = \deg p_\nu$. Show that the
coefficients of $h_1,\ldots,h_n$ are the solutions of a system of linear equations, such that
the number of variables is equal to the number of equations, namely

$$\deg f = i_1 d_1 + \cdots + i_n d_n.$$

Theorem 4.3 shows that this system has a unique solution.

3. Find the $(t - 2)$-adic expansion of the following polynomials.

(a) $t^2 - 1$ (b) $t^3 + t - 1$ (c) $t^3 + 3$ (d) $t^4 + 2t^3 - t + 5$

4. Find the $(t - 3)$-adic expansion of the polynomials in Exercise 3.
IV, §5. POLYNOMIALS OVER RINGS AND
OVER THE INTEGERS

Let $R$ be a commutative ring. We formed the polynomial ring over $R$,
just as we did over a field, namely $R[t]$ consists of all formal sums

$$f(t) = a_n t^n + \cdots + a_0$$

with elements $a_0,\ldots,a_n$ in $R$ which are called the coefficients of $f$. The
sum and product are defined just as over a field. The ring $R$ could be
integral, in which case $R$ has a quotient field $K$, and $R[t]$ is contained in
$K[t]$. In fact, $R[t]$ is a subring of $K[t]$.

The advantage of dealing with coefficients in a ring is that we can deal 
with more refined properties of polynomials. Polynomials with coeffi- 
cients in the ring of integers Z form a particularly interesting ring. We 
shall prove some special properties of such polynomials, leading to an 
important criterion for irreducibility of polynomials over the rational 
numbers. Before we do that, we make one more general comment on 
polynomials over rings. 


Let $\sigma: R \to S$
be a homomorphism of commutative rings. If $f(t) \in R[t]$ is a polynomial
over $R$ as above, then we define the polynomial $\sigma f$ in $S[t]$ to be the
polynomial

$$(\sigma f)(t) = \sigma(a_n)t^n + \cdots + \sigma(a_0) = \sum_{i=0}^{n} \sigma(a_i) t^i.$$

Then it is immediately verified that $\sigma$ thereby is a ring homomorphism

$$R[t] \to S[t]$$

which we also denote by $\sigma$. Indeed, let

$$f(t) = \sum_{i=0}^{n} a_i t^i \quad\text{and}\quad g(t) = \sum_{i=0}^{n} b_i t^i.$$

Then $(f + g)(t) = \sum (a_i + b_i)t^i$ so

$$\sigma(f + g)(t) = \sum \sigma(a_i + b_i) t^i = \sum (\sigma(a_i) + \sigma(b_i)) t^i = \sigma f(t) + \sigma g(t).$$

Also $fg(t) = \sum c_k t^k$ where

$$c_k = \sum_{i=0}^{k} a_i b_{k-i},$$

so

$$\sigma(fg)(t) = \sum_k \sum_{i=0}^{k} \sigma(a_i b_{k-i}) t^k = \sum_k \sum_{i=0}^{k} \sigma(a_i)\sigma(b_{k-i}) t^k = (\sigma f)(t)(\sigma g)(t).$$

This proves that $\sigma$ induces a homomorphism $R[t] \to S[t]$.

Example. Let $R = \mathbf{Z}$ and let $S = \mathbf{Z}/n\mathbf{Z}$ for some integer $n \geq 2$. For
each integer $a \in \mathbf{Z}$ let $\bar{a}$ denote its residue class mod $n$. Then the map

$$\sum a_i t^i \mapsto \sum \bar{a}_i t^i$$

is a ring homomorphism of $\mathbf{Z}[t]$ into $(\mathbf{Z}/n\mathbf{Z})[t]$. Note that we don't need
to assume that $n$ is prime. This ring homomorphism is called reduction
mod $n$. If $n = p$ is a prime number, and we let $\mathbf{F}_p = \mathbf{Z}/p\mathbf{Z}$ so $\mathbf{F}_p$ is a field,
then we get a homomorphism

$$\mathbf{Z}[t] \to \mathbf{F}_p[t]$$

where $\mathbf{F}_p[t]$ is the polynomial ring over the field $\mathbf{F}_p$.
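Reduction mod $n$ is easy to experiment with. The sketch below checks on one example that reducing a product agrees with the product of the reductions, and then shows that for a composite modulus such as $n = 6$ the ring $(\mathbf{Z}/n\mathbf{Z})[t]$ has divisors of zero. The coefficient-list representation and the names `mul` and `reduce_mod` are assumptions for this illustration.

```python
def mul(f, g):
    """Product of integer polynomials given as coefficient lists (low degree first)."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def reduce_mod(f, n):
    """Reduce every coefficient to its residue class representative in 0..n-1."""
    return [a % n for a in f]

f = [3, -4, 5]   # 3 - 4t + 5t^2
g = [7, 2]       # 7 + 2t
n = 6

left = reduce_mod(mul(f, g), n)                                   # reduction of fg
right = reduce_mod(mul(reduce_mod(f, n), reduce_mod(g, n)), n)    # product of the reductions
print(left == right)

# With n = 6 composite, (2t)(3t) = 6t^2 reduces to the zero polynomial.
print(reduce_mod(mul([0, 2], [0, 3]), 6))
```

The second computation is the reason the primality of $p$ matters when one wants $(\mathbf{Z}/p\mathbf{Z})[t]$ to have no divisors of zero.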

A polynomial over Z will be called primitive if its coefficients are 
relatively prime, that is if there is no prime p which divides all 
coefficients. In particular, a primitive polynomial is not the zero poly- 
nomial. 


Lemma 5.1. Let $f$ be a polynomial $\neq 0$ over the rational numbers. Then
there exists a rational number $a \neq 0$ such that $af$ has integer coefficients,
which are relatively prime, i.e. $af$ is primitive.

Proof. Write

$$f(t) = a_n t^n + \cdots + a_0,$$

where $a_0,\ldots,a_n$ are rational numbers, and $a_n \neq 0$. Let $d$ be a common
denominator for $a_0,\ldots,a_n$. Then $df$ has integral coefficients, namely
$da_n,\ldots,da_0$. Let $b$ be a greatest common divisor for $da_n,\ldots,da_0$. Then

$$\frac{d}{b} f(t) = \frac{da_n}{b} t^n + \cdots + \frac{da_0}{b}$$

has relatively prime integral coefficients, as was to be shown.
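The proof of Lemma 5.1 is constructive, and the sketch below follows it step by step: clear denominators with $d$, then divide by the greatest common divisor $b$. The name `primitive_multiple` is hypothetical, and $d$ is taken here to be the least common denominator, which is one admissible choice among many.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def primitive_multiple(f):
    """Return (a, af) with a = d/b as in the proof of Lemma 5.1.

    f is a non-zero polynomial over Q given as a coefficient list (low degree first).
    """
    f = [Fraction(c) for c in f]
    # d: a common denominator (here the least common multiple of the denominators)
    d = reduce(lambda x, y: x * y // gcd(x, y), (c.denominator for c in f), 1)
    cleared = [int(c * d) for c in f]    # coefficients of d*f, all integers
    b = reduce(gcd, cleared)             # greatest common divisor of these integers
    return Fraction(d, b), [c // b for c in cleared]

# f(t) = (2/3)t^2 + (4/9)t + 2
a, af = primitive_multiple([2, Fraction(4, 9), Fraction(2, 3)])
print(a, af)
```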


Lemma 5.2 (Gauss). Let $f, g$ be primitive polynomials over the integers.
Then $fg$ is primitive.

Proof. Write

$$f(t) = a_n t^n + \cdots + a_0, \qquad a_n \neq 0,$$

$$g(t) = b_m t^m + \cdots + b_0, \qquad b_m \neq 0,$$

with relatively prime $(a_n,\ldots,a_0)$ and relatively prime $(b_m,\ldots,b_0)$. Let $p$ be
a prime. It will suffice to prove that $p$ does not divide every coefficient of
$fg$. Let $r$ be the largest integer such that $0 \leq r \leq n$, $a_r \neq 0$, and $p$ does
not divide $a_r$. Similarly, let $b_s$ be the coefficient of $g$ farthest to the left,
$b_s \neq 0$, such that $p$ does not divide $b_s$. Consider the coefficient of $t^{r+s}$ in
$f(t)g(t)$. This coefficient is equal to

$$c = a_r b_s + a_{r+1} b_{s-1} + \cdots + a_{r-1} b_{s+1} + \cdots$$

and $p$ does not divide $a_r b_s$. However, $p$ divides every other non-zero
term in this sum since each term will be of the form

$$a_i b_{r+s-i}$$

with $a_i$ to the left of $a_r$, that is $i > r$, or of the form

$$a_{r+s-j} b_j$$

with $j > s$, that is $b_j$ to the left of $b_s$. Hence $p$ does not divide $c$, and our
lemma is proved.

We shall now give a second proof using the idea of reduction modulo
a prime. Suppose $f, g$ are primitive polynomials in $\mathbf{Z}[t]$ but $fg$ is not
primitive. So there exists a prime $p$ which divides all coefficients of $fg$.
Let $\sigma$ be reduction mod $p$. Then $\sigma f \neq 0$ and $\sigma g \neq 0$ but $\sigma(fg) = 0$.
However, the polynomial ring $(\mathbf{Z}/p\mathbf{Z})[t]$ has no divisors of zero, so we
get a contradiction proving the lemma.

I prefer this second proof for several reasons, but the technique of 
picking coefficients furthest to the left or to the right will reappear in the 
proof of Eisenstein’s criterion, so I gave both proofs. 
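The first proof singles out, for each prime $p$, a specific coefficient of $fg$ that $p$ cannot divide, namely the coefficient of $t^{r+s}$. The sketch below computes this index and checks the claim on one pair of primitive polynomials. The names `content`, `mul`, and `witness_index` are assumptions for this illustration, and the example polynomials are not taken from the text.

```python
from functools import reduce
from math import gcd

def content(f):
    """Greatest common divisor of the coefficients; 1 means f is primitive."""
    return reduce(gcd, f)

def mul(f, g):
    """Product of integer polynomials given as coefficient lists (low degree first)."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def witness_index(f, g, p):
    """Index r + s from the first proof: r, s are the largest indices with p not dividing a_r, b_s."""
    r = max(i for i, a in enumerate(f) if a % p != 0)
    s = max(j for j, b in enumerate(g) if b % p != 0)
    return r + s

f = [6, 10, 15]   # 6 + 10t + 15t^2, primitive
g = [4, 9]        # 4 + 9t, primitive
fg = mul(f, g)

print(content(fg))
for p in (2, 3, 5):
    k = witness_index(f, g, p)
    print(p, k, fg[k] % p != 0)   # p does not divide the coefficient of t^k
```

Note that no single coefficient of $f$ is prime to 2, 3, and 5 at once, so the witness index genuinely depends on $p$.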

Let $f(t) \in \mathbf{Z}[t]$ be a polynomial of degree $\geq 1$. Suppose that $f$ is
reducible over $\mathbf{Z}$, that is

$$f(t) = g(t)h(t),$$

where $g, h$ have coefficients in $\mathbf{Z}$ and $\deg g, \deg h \geq 1$. Then of course $f$
is reducible over the quotient field $\mathbf{Q}$. Conversely, suppose $f$ is reducible
over $\mathbf{Q}$. Is $f$ reducible over $\mathbf{Z}$? The answer is yes, and we prove it in the
next theorem.

Theorem 5.3 (Gauss). Let $f$ be a primitive polynomial in $\mathbf{Z}[t]$, of degree
$\geq 1$. If $f$ is reducible over $\mathbf{Q}$, that is if we can write $f = gh$ with
$g, h \in \mathbf{Q}[t]$, and $\deg g \geq 1$, $\deg h \geq 1$, then $f$ is reducible over $\mathbf{Z}$. More
precisely, there exist rational numbers $a, b$ such that, if we let $g_1 = ag$
and $h_1 = bh$, then $g_1, h_1$ have integer coefficients, and $f = g_1 h_1$.

Proof. By Lemma 5.1, let $a, b$ be non-zero rational numbers such that
$ag$ and $bh$ have integer coefficients, relatively prime. Let $g_1 = ag$ and
$h_1 = bh$. Then

$$f = \frac{1}{a} g_1 \frac{1}{b} h_1,$$

whence $abf = g_1 h_1$. By Lemma 5.2, $g_1 h_1$ has relatively prime integer
coefficients. Since the coefficients of $f$ are assumed to be relatively prime
integers, it follows at once that $ab$ itself must be an integer, and cannot
be divisible by any prime. Hence $ab = \pm 1$, and dividing (say) $g_1$ by $ab$
we obtain what we want.

Warning. The result of Theorem 5.3 is not generally true for every 
integral ring R. Some restriction on R is needed, like unique factoriza- 
tion. We shall see later how the notion of unique factorization can be 
generalized, as well as its consequences. 

Theorem 5.4 (Eisenstein's criterion). Let

$$f(t) = a_n t^n + \cdots + a_0$$

be a polynomial of degree $n \geq 1$ with integer coefficients. Let $p$ be a
prime, and assume

$$a_n \not\equiv 0 \pmod{p}, \qquad a_i \equiv 0 \pmod{p} \text{ for all } i < n,$$

$$a_0 \not\equiv 0 \pmod{p^2}.$$

Then $f$ is irreducible over the rationals.

Proof. We first divide $f$ by the greatest common divisor of its
coefficients, and we may then assume that $f$ has relatively prime coefficients.
By Theorem 5.3, we must show that $f$ cannot be written as a product
$f = gh$ with $g, h$ having integral coefficients, and $\deg g, \deg h \geq 1$.
Suppose this can be done, and write

$$g(t) = b_d t^d + \cdots + b_0,$$

$$h(t) = c_m t^m + \cdots + c_0,$$

with $d, m \geq 1$ and $b_d c_m \neq 0$. Since $b_0 c_0 = a_0$ is divisible by $p$ but not $p^2$,
it follows that one of the numbers $b_0, c_0$ is not divisible by $p$, say $b_0$.
Then $p \mid c_0$. Since $c_m b_d = a_n$ is not divisible by $p$, it follows that $p$ does
not divide $c_m$. Let $c_r$ be the coefficient of $h$ farthest to the right such
that $c_r \not\equiv 0 \pmod{p}$. Then $r \neq 0$ and

$$a_r = b_0 c_r + b_1 c_{r-1} + \cdots + b_r c_0.$$

Since $p \nmid b_0 c_r$ but $p$ divides every other term in this sum, we conclude
that $p \nmid a_r$. Since $r \leq m < n$, this contradicts the hypothesis that $p$
divides $a_i$ for all $i < n$, and proves our theorem.

Example. The polynomial $t^5 - 2$ is irreducible over the rational
numbers, as a direct application of Theorem 5.4.
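The hypotheses of Theorem 5.4 are three divisibility conditions, so they can be tested directly. The sketch below does this for a polynomial given as a list of integer coefficients `[a_0, ..., a_n]`; the function name `eisenstein_applies` is hypothetical. A result of `False` only means that the criterion does not apply for that prime, not that the polynomial is reducible.

```python
def eisenstein_applies(f, p):
    """Check the hypotheses of Theorem 5.4 for f = [a_0, ..., a_n], n >= 1, and a prime p."""
    *lower, lead = f
    return (lead % p != 0                         # a_n is not divisible by p
            and all(a % p == 0 for a in lower)    # a_i is divisible by p for all i < n
            and f[0] % (p * p) != 0)              # a_0 is not divisible by p^2

print(eisenstein_applies([-2, 0, 0, 0, 0, 1], 2))   # t^5 - 2 with p = 2
print(eisenstein_applies([-2, 0, 0, 0, 0, 1], 3))   # the same polynomial with p = 3
print(eisenstein_applies([-4, 0, 1], 2))            # t^2 - 4 = (t - 2)(t + 2)
```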

Another criterion for irreducibility is given by the next theorem, and 
uses the notion of reduction mod p for some prime p. 


Theorem 5.5 (Reduction criterion). Let $f(t) \in \mathbf{Z}[t]$ be a primitive
polynomial with leading coefficient $a_n$ which is not divisible by a prime
$p$. Let $\mathbf{Z} \to \mathbf{Z}/p\mathbf{Z} = F$ be reduction mod $p$, and denote the image of $f$ by
$\bar{f}$. If $\bar{f}$ is irreducible in $F[t]$, then $f$ is irreducible in $\mathbf{Q}[t]$.

Proof. By Theorem 5.3 it suffices to prove that $f$ does not have a
factorization $f = gh$ with $\deg g$ and $\deg h \geq 1$ and $g, h \in \mathbf{Z}[t]$. Suppose
there is such a factorization. Let

$$f(t) = a_n t^n + \text{lower terms},$$

$$g(t) = b_r t^r + \text{lower terms},$$

$$h(t) = c_s t^s + \text{lower terms}.$$

Then $a_n = b_r c_s$, and since $a_n$ is not divisible by $p$ by hypothesis, it follows
that $b_r, c_s$ are not divisible by $p$. Hence

$$\bar{f} = \bar{g}\bar{h}$$

and $\deg \bar{g}, \deg \bar{h} \geq 1$, which contradicts the hypothesis that $\bar{f}$ is
irreducible over $F$. This proves Theorem 5.5.

Example. The polynomial $t^3 - t - 1$ is irreducible over $\mathbf{Z}/3\mathbf{Z}$,
otherwise it would have a root which must be 0, 1, or $-1$ mod 3. You can see
that this is not the case by plugging in these three values. Hence
$t^3 - t - 1$ is irreducible over $\mathbf{Q}$.

In Chapter VII, §3, Exercise 1, you will prove that $t^5 - t - 1$ is
irreducible over $\mathbf{Z}/5\mathbf{Z}$. It follows that $t^5 - t - 1$ is irreducible over $\mathbf{Q}$.
You could already try to prove here that $t^5 - t - 1$ is irreducible over
$\mathbf{Z}/5\mathbf{Z}$.
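Plugging in the three values can be delegated to a few lines of code. The sketch below lists the roots of an integer polynomial in $\mathbf{Z}/p\mathbf{Z}$ by trying every residue; the name `roots_mod_p` is an assumption for this illustration. For a polynomial of degree 2 or 3, having no root is equivalent to being irreducible over the field, which is what the example uses.

```python
def roots_mod_p(f, p):
    """Roots in Z/pZ of f = [a_0, ..., a_n], found by evaluating at every residue."""
    return [x for x in range(p)
            if sum(a * pow(x, k, p) for k, a in enumerate(f)) % p == 0]

print(roots_mod_p([-1, -1, 0, 1], 3))         # t^3 - t - 1 over Z/3Z
print(roots_mod_p([-1, -1, 0, 0, 0, 1], 5))   # t^5 - t - 1 over Z/5Z
```

For $t^5 - t - 1$ over $\mathbf{Z}/5\mathbf{Z}$ the absence of roots only rules out factors of degree 1. A factorization into factors of degree 2 and 3 must still be excluded by another argument, which is the content of the exercise cited above.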


IV, §5. EXERCISES 

1. Integral root theorem. Let $f(t) = t^n + \cdots + a_0$ be a polynomial of degree $n \geq 1$
with integer coefficients, leading coefficient 1, and $a_0 \neq 0$. Show that if $f$ has a
root in the rational numbers, then this root is in fact an integer, and that this
integer divides $a_0$.

2. Determine which of the following polynomials are irreducible over the rational
numbers:

(a) $t^3 - t + 1$ (b) $t^3 + 2t + 10$ (c) $t^3 - t - 1$ (d) $t^3 - 2t^2 + t + 15$

3. Determine which of the following polynomials are irreducible over the rational
numbers:

(a) $t^4 + 2$ (b) $t^4 - 2$ (c) $t^4 + 4$ (d) $t^4 - t + 1$

4. Let $f(t) = a_n t^n + \cdots + a_0$ be a polynomial of degree $n \geq 1$ with integer
coefficients, assumed relatively prime, and $a_n a_0 \neq 0$. If $b/c$ is a rational number
expressed as a quotient of relatively prime integers, $b, c \neq 0$, and if $f(b/c) = 0$,
show that $c$ divides $a_n$ and $b$ divides $a_0$. (This result allows us to determine
effectively all possible rational roots of $f$ since there is only a finite number of
divisors of $a_n$ and $a_0$.)

Remark. The integral root theorem of Exercise 1 is a special case of the above 
statement. 

5. Determine all rational roots of the following polynomials:

(a) $t^7 - 1$ (b) $t^8 - 1$ (c) $2t^2 - 3t + 4$ (d) $3t^3 + t - 5$

(e) $2t^4 - 4t + 3$

6. Let p be a prime number. Let 


/(f) = t p ~ 1 + t p ~ 2 + + 1. 

Prove that f(t) is irreducible in Z[f]. [Hint: Observe that 

= l)/(f- 1). 

Let u — t — 1 so t = u + 1. Use Eisenstein.] 
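
The substitution in the hint can be explored numerically before writing a
proof. The sketch below is an illustration only: the helper names are
invented, coefficient lists run from highest degree to lowest, and the
expansion of $((u+1)^p - 1)/u$ by the binomial theorem is assumed rather
than proved.

```python
from math import comb

def eisenstein(coeffs, p):
    """Check Eisenstein's hypotheses at p for [a_n, ..., a_0]."""
    lead, rest, const = coeffs[0], coeffs[1:], coeffs[-1]
    return (lead % p != 0                       # a_n not divisible by p
            and all(a % p == 0 for a in rest)   # a_i divisible by p for i < n
            and const % (p * p) != 0)           # a_0 not divisible by p^2

def shifted_cyclotomic(p):
    """Coefficients of ((u + 1)^p - 1)/u, highest degree first."""
    # (u + 1)^p - 1 is the sum of C(p, k) u^k for k = 1..p; dividing by u lowers each exponent by 1.
    return [comb(p, k) for k in range(p, 0, -1)]

checks = {p: eisenstein(shifted_cyclotomic(p), p) for p in (2, 3, 5, 7, 11, 13)}
```

A computation of this kind only confirms the hypotheses for particular
primes; the exercise asks for an argument valid for every $p$.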

The next two exercises show how to construct an irreducible polynomial over
the rational numbers, of given degree $d$ and having precisely $d - 2$ real
roots (so of course a pair of complex conjugate roots). The construction
will proceed in two steps. The first has nothing to do with the rational
numbers.

7. Continuity of the roots of a polynomial. Let $d$ be an integer $\geq 3$. Let

    $f_n(t) = t^d + a_{d-1}^{(n)} t^{d-1} + \cdots + a_0^{(n)}$

be a sequence of polynomials with complex coefficients $a_j^{(n)}$. Let

    $f(t) = t^d + a_{d-1} t^{d-1} + \cdots + a_0$

be another polynomial in $\mathbf{C}[t]$. We shall say that $f_n(t)$
converges to $f(t)$ as $n \to \infty$ if for each $j = 0, \ldots, d - 1$ we
have

    $\lim_{n \to \infty} a_j^{(n)} = a_j.$

Thus the coefficients of $f_n$ converge to the coefficients of $f$.
Factorize $f_n$ and $f$ into factors of degree 1:

    $f(t) = (t - \alpha_1) \cdots (t - \alpha_d)$ and $f_n(t) = (t - \alpha_1^{(n)}) \cdots (t - \alpha_d^{(n)})$.

(a) Prove the following theorem.

Assume that $f_n(t)$ converges to $f(t)$, and for simplicity assume that the
roots $\alpha_1, \ldots, \alpha_d$ are distinct. Then for each $n$ we can
order the roots $\alpha_1^{(n)}, \ldots, \alpha_d^{(n)}$ in such a way that
for $i = 1, \ldots, d$ we have

    $\lim_{n \to \infty} \alpha_i^{(n)} = \alpha_i.$

This shows that if the coefficients of $f_n$ converge to the coefficients of
$f$, then the roots of $f_n$ converge to the roots of $f$.

(b) Suppose that $f$ and $f_n$ have real coefficients for all $n$. Assume
that $\alpha_3, \ldots, \alpha_d$ are real, and $\alpha_1$, $\alpha_2$ are
complex conjugate. Prove that for all $n$ sufficiently large,
$\alpha_i^{(n)}$ is real for $i = 3, \ldots, d$; and $\alpha_1^{(n)}$,
$\alpha_2^{(n)}$ are complex conjugate.

8. Let $d$ be an integer $\geq 3$. Prove the existence of an irreducible
polynomial of degree $d$ over the rational numbers, having precisely $d - 2$
real roots (and a pair of complex conjugate roots). Use the following
construction. Let $b_1, \ldots, b_{d-2}$ be distinct integers and let $a$ be
an integer $> 0$. Let

    $g(t) = (t^2 + a)(t - b_1) \cdots (t - b_{d-2}) = t^d + c_{d-1} t^{d-1} + \cdots + c_0.$

Observe that $c_i \in \mathbf{Z}$. Let $p$ be a prime number, and let

    $g_n(t) = g(t) + \frac{p}{p^{dn}}$

so that $g_n(t)$ converges to $g(t)$.

(a) Prove that $g_n(t)$ has precisely $d - 2$ real roots for $n$ sufficiently large.

(b) Prove that $g_n(t)$ is irreducible over $\mathbf{Q}$.

(Note: You might use the preceding exercise for (a), but one can also give a
simple proof just looking at the graph of $g$, using the fact that $g_n$ is
just a slight raising of $g$ above the horizontal axis.

Obviously the same method can be used to construct irreducible polynomials
over $\mathbf{Q}$ with arbitrarily many real roots and pairs of complex
conjugate roots. There is particular significance for the special case of
$d - 2$ real roots, when $d$ is a prime number, as you will see in Chapter
VII, §4, Exercise 15. In this special case, you will then be able to prove
that the Galois group of the polynomial is the full symmetric group.)

9. Let $R$ be a factorial ring (see the definition in the next section), and
let $p$ be a prime element in $R$. Let $d$ be an integer $\geq 2$, and let

    $f(t) = t^d + c_{d-1} t^{d-1} + \cdots + c_0$

be a polynomial with coefficients $c_i \in R$. Let $n$ be an integer
$\geq 1$, and let

    $g(t) = f(t) + p/p^{nd}.$

Prove that $g(t)$ is irreducible in $K[t]$, where $K$ is the quotient field of $R$.


IV, §6. PRINCIPAL RINGS AND FACTORIAL RINGS 

We have seen a systematic analogy between the ring of integers $\mathbf{Z}$
and the ring of polynomials $K[t]$. Both have a Euclidean algorithm; both
have unique factorization into certain elements, which are called primes
in $\mathbf{Z}$ or irreducible polynomials in $K[t]$. It turns out actually
that the most important property is not the Euclidean algorithm, but another
property which we now axiomatize.

Let R be an integral ring. We say that R is a principal ring if in 
addition every ideal of R is principal. 

Examples. If $R = \mathbf{Z}$ or $R = K[t]$, then $R$ is principal. For
$\mathbf{Z}$ this was proved in Chapter I, Theorem 3.1, and for polynomials
it was proved in Theorem 2.1 of the present chapter.

Practically all the properties which we have proved for Z or for 
K[t] are also valid for principal rings. We shall now make a list. 

Let $R$ be an integral ring. Let $p \in R$ and $p \neq 0$. We define $p$ to
be prime if $p$ is not a unit, and given a factorization

    $p = ab$ with $a, b \in R$

then $a$ or $b$ is a unit.

An element $a \in R$, $a \neq 0$ is said to have unique factorization into
primes if there exists a unit $u$ and there exist prime elements $p_i$
($i = 1, \ldots, r$) in $R$ (not necessarily distinct) such that

    $a = u p_1 \cdots p_r$;

and if given two factorizations into prime elements

    $a = u p_1 \cdots p_r = u' q_1 \cdots q_s$,

then $r = s$ and after a permutation of the indices $i$, we have
$p_i = u_i q_i$, where $u_i$ is a unit, $i = 1, \ldots, r$.

We note that if p is prime and u is a unit, then up is also prime, so we 
must allow multiplication by units in the factorization. In the ring of 
integers Z, the ordering allows us to select a representative prime 
element, namely a prime number, out of two possible ones differing by a 
unit, namely $\pm p$, by selecting the positive one. This is, of course,
impossible in more general rings. However, in the ring of polynomials 
over a field, we can select the prime element to be the irreducible 
polynomial with leading coefficient 1. 

A ring is called factorial, or a unique factorization ring, if it is integral,
and if every element $\neq 0$ has a unique factorization into primes.

Let $R$ be an integral ring, and $a, b \in R$, $a \neq 0$. We say that $a$
divides $b$ and write $a \mid b$ if there exists $c \in R$ such that
$ac = b$. We say that $d \in R$, $d \neq 0$ is a greatest common divisor of
$a$ and $b$ if $d \mid a$, $d \mid b$, and if any element $c$ of $R$,
$c \neq 0$ divides both $a$ and $b$, then $c$ also divides $d$. Note that a
g.c.d. is determined only up to multiplication by a unit.

Proposition 6.1. Let $R$ be a principal ring. Let $a, b \in R$ and
$ab \neq 0$. Let $(a, b) = (c)$, that is let $c$ be a generator of the ideal
$(a, b)$. Then $c$ is a greatest common divisor of $a$ and $b$.

Proof. Since $b$ lies in the ideal $(c)$, we can write $b = xc$ for some
$x \in R$, so that $c \mid b$. Similarly, $c \mid a$. Let $d$ divide both
$a$ and $b$, and write $a = dy$, $b = dz$ with $y, z \in R$. Since $c$ lies
in $(a, b)$ we can write

    $c = wa + tb$

with some $w, t \in R$. Then $c = wdy + tdz = d(wy + tz)$, whence
$d \mid c$, and our proposition is proved.
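
In the special case $R = \mathbf{Z}$, a generator $c$ of $(a, b)$ together
with coefficients $w$, $t$ satisfying $c = wa + tb$ can be computed by the
extended Euclidean algorithm. The sketch below is an illustration for the
integers only; the function name and the sample values 84 and 30 are
assumptions made for the example.

```python
def extended_gcd(a, b):
    """Return (c, w, t) with c = w*a + t*b, where c generates the ideal (a, b) in Z."""
    old_r, r = a, b
    old_w, w = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        # Each remainder stays an integral combination of a and b.
        old_r, r = r, old_r - q * r
        old_w, w = w, old_w - q * w
        old_t, t = t, old_t - q * t
    return old_r, old_w, old_t

c, w, t = extended_gcd(84, 30)
```

The value of `c` is determined only up to sign, in agreement with the
remark that a g.c.d. is determined up to multiplication by a unit.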

Theorem 6.2. Let $R$ be a principal ring. Then $R$ is factorial.

Proof. We first prove that every non-zero element of $R$ has a
factorization into irreducible elements. Given $a \in R$, $a \neq 0$. If $a$
is prime, we are done. If not, then $a = a_1 b_1$ where neither $a_1$ nor
$b_1$ is a unit. Then $(a) \subset (a_1)$. We assert that

    $(a) \neq (a_1).$

Indeed, if $(a) = (a_1)$ then $a_1 = ax$ for some $x \in R$ and then
$a = axb_1$ so $xb_1 = 1$, whence both $x$, $b_1$ are units contrary to
assumption. If both $a_1$, $b_1$ are prime, we are done. Suppose that $a_1$
is not prime. Then $a_1 = a_2 b_2$ where neither $a_2$ nor $b_2$ is a unit.
Then $(a_1) \subset (a_2)$, and by what we have just seen,
$(a_1) \neq (a_2)$. Proceeding in this way we obtain a chain of ideals

    $(a_1) \subsetneq (a_2) \subsetneq (a_3) \subsetneq \cdots \subsetneq (a_n) \subsetneq \cdots.$

We claim that actually, this chain must stop for some integer $n$. Let

    $J = \bigcup_{n=1}^{\infty} (a_n).$

Then $J$ is an ideal. By assumption $J$ is principal, so $J = (c)$ for some
element $c \in R$. But $c$ lies in the ideal $(a_n)$ for some $n$, and so we
have the double inclusion

    $(a_n) \subset (c) \subset (a_n),$

whence $(c) = (a_n)$. Therefore $(a_n) = (a_{n+1}) = \cdots$, and the chain
of ideals could not have proper inclusions at each step. This implies that
$a$ can be expressed as a product

    $a = p_1 \cdots p_r$ where $p_1, \ldots, p_r$ are prime.

Next we prove the uniqueness.

Lemma 6.3. Let $R$ be a principal ring. Let $p$ be a prime element. Let
$a, b \in R$. If $p \mid ab$ then $p \mid a$ or $p \mid b$.

Proof. If $p \nmid a$ then a g.c.d. of $p$, $a$ is 1, and $(p, a)$ is the
unit ideal. Hence we can write

    $1 = xp + ya$

with some $x, y \in R$. Then $b = bxp + yab$, and since $p \mid ab$, we
conclude that $p \mid b$. This proves the lemma.

Suppose finally that $a$ has two factorizations

    $a = p_1 \cdots p_r = q_1 \cdots q_s$

into prime elements. Since $p_1$ divides the product furthest to the right,
it follows by the lemma that $p_1$ divides one of the factors, which we may
assume to be $q_1$ after renumbering these factors. Then there exists a
unit $u_1$ such that $q_1 = u_1 p_1$. We can now cancel $p_1$ from both
factorizations and get

    $p_2 \cdots p_r = u_1 q_2 \cdots q_s.$

The argument is completed by induction. This concludes the proof of
Theorem 6.2.

For emphasis, we state separately the following result. 

Proposition 6.4. Let $R$ be a factorial ring. An element $p \in R$,
$p \neq 0$ is prime if and only if the ideal $(p)$ is prime.

Note that for any integral ring $R$, we have the implication

    $a \in R$, $a \neq 0$ and $(a)$ prime $\Rightarrow$ $a$ prime.

Indeed, if we write $a = bc$ with $b, c \in R$ then $b \in (a)$ or
$c \in (a)$ by definition of a prime ideal. Say we can write $b = ad$ with
some $d \in R$. Then $a = acd$. Hence $cd = 1$, whence $c$, $d$ are units,
and therefore $a$ is prime.

In a factorial ring, we also have the converse, because of unique 
factorization. In a principal ring, the key step was Lemma 6.3, which 
means precisely that (p) is a prime ideal. 

In a factorial ring, we can make the same definitions as for the
integers or polynomials. If

    $a = u p_1^{m_1} \cdots p_r^{m_r}$

is a factorization with a unit $u$ and distinct primes $p_1, \ldots, p_r$,
then we define

    $m_i = \mathrm{ord}_{p_i}(a)$ = order of $a$ at $p_i$.

If $a, b \in R$ are non-zero elements, we say that $a$, $b$ are relatively
prime if the g.c.d. of $a$ and $b$ is a unit. Similarly elements
$a_1, \ldots, a_m$ are relatively prime means that no prime element $p$
divides all of them. If $R$ is a principal ring, to say that $a$, $b$ are
relatively prime is equivalent to saying that the ideal $(a, b)$ is the unit
ideal.
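
For $R = \mathbf{Z}$ the order at a prime is found by repeated division. The
following sketch is an illustration only; the name `order_at` and the sample
integer 360 are assumptions made for the example.

```python
def order_at(p, a):
    """ord_p(a): the exponent of the prime p in the nonzero integer a."""
    m = 0
    while a % p == 0:
        a //= p
        m += 1
    return m

# 360 = 2^3 * 3^2 * 5, so the orders at 2, 3, 5, 7 are 3, 2, 1, 0.
orders = [order_at(p, 360) for p in (2, 3, 5, 7)]
```

Since orders add under multiplication, one finds for instance
`order_at(3, 360 * 54) == order_at(3, 360) + order_at(3, 54)`.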

Other theorems which we proved for the integers Z are also valid in 
factorial rings. We now make a list, and comment on the proofs, which 
are essentially identical with the previous ones, as in §5. Thus we now 
study factorial rings further. 
Let $R$ be factorial. Let $f(t) \in R[t]$ be a polynomial. As in the case
when $R = \mathbf{Z}$, we say that $f$ is primitive if there is no prime of
$R$ which divides all coefficients of $f$, i.e. if the coefficients of $f$
are relatively prime.

Lemma 6.5. Let $R$ be a factorial ring. Let $K$ be its quotient field. Let
$f \in K[t]$ be a polynomial $\neq 0$. Then there exists an element
$a \in K$, $a \neq 0$ such that $af$ is primitive.

Proof. Follow step by step the proof of Lemma 5.1. No other
property was used besides the fact that the ring is factorial.

Lemma 6.6 (Gauss). Let $f$, $g$ be primitive polynomials over the factorial
ring $R$. Then $fg$ is primitive.

Proof. Follow step by step the proof of Lemma 5.2.
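
Over $R = \mathbf{Z}$ the Gauss lemma can be observed on examples by
computing the g.c.d. of the coefficients of a product. The sketch below is
an illustration only: the names `content` and `multiply` are invented here,
coefficient lists run from lowest degree to highest, and the sample
polynomials are arbitrary.

```python
from functools import reduce
from math import gcd

def content(coeffs):
    """G.c.d. of the coefficients; a polynomial over Z is primitive when this is 1."""
    return reduce(gcd, coeffs)

def multiply(f, g):
    """Product of two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

f = [2, 3]        # 2 + 3t, primitive
g = [3, 0, 5]     # 3 + 5t^2, primitive
fg = multiply(f, g)
```

Here `fg` is `[6, 9, 10, 15]`, whose coefficients have g.c.d. 1 even though
no single coefficient is a unit.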

Theorem 6.7. Let $R$ be a factorial ring and $K$ its quotient field. Let
$f \in R[t]$ be a primitive polynomial, and $\deg f \geq 1$. If $f$ is
reducible over $K$ then $f$ is reducible over $R$. More precisely, if
$f = gh$ with $g, h \in K[t]$ and $\deg g \geq 1$, $\deg h \geq 1$, then
there exist elements $a, b \in K$ such that, if we let $g_1 = ag$ and
$h_1 = bh$, then $g_1$, $h_1$ have coefficients in $R$, and
$f = g_1 h_1$.

Proof. Follow step by step the proof of Theorem 5.3. Of course, at 
the end of the proof, we found ab equal to a unit. When R = Z, a unit 
is ± 1, but in the general case all we can say is that ab is a unit, so a is 
a unit or b is a unit. This is the only difference in the proof. 


If $R$ is a principal ring, it is usually not true that $R[t]$ is also a
principal ring. We shall discuss this systematically in a moment when we
consider polynomials in several variables. However, we shall prove that
$R[t]$ is factorial, and we shall prove even more.

First we make a remark which will be used in the following proofs. 

Lemma 6.8. Let $R$ be a factorial ring, and let $K$ be its quotient field.
Let

    $g = g(t) \in R[t]$, $h = h(t) \in R[t]$

be primitive polynomials in $R[t]$. Let $b \in K$ be such that

    $g = bh.$

Then $b$ is a unit of $R$.

Proof. Write $b = a/d$ where $a$, $d$ are relatively prime elements of $R$,
so $a$ is a numerator and $d$ is a denominator for $b$. Then

    $dg = ah.$

Write $h(t) = c_m t^m + \cdots + c_0$. If $p$ is a prime dividing $d$ then
$p$ divides $ac_j$ for $j = 0, \ldots, m$. Since $c_0, \ldots, c_m$ are
relatively prime, it follows that $p$ divides $a$, which is against our
assumption that $a$, $d$ are relatively prime in $R$. Similarly, no prime
can divide $a$. Therefore $b$ is a unit of $R$ as asserted.

Theorem 6.9. Let $R$ be a factorial ring. Then $R[t]$ is factorial. The
units of $R[t]$ are the units of $R$. The prime elements of $R[t]$ are
either the primes of $R$, or the primitive irreducible polynomials in
$R[t]$.

Proof. Let $p$ be a prime of $R$. If $p = ab$ in $R[t]$ then

    $\deg a = \deg b = 0,$

so $a, b \in R$, and by hypothesis $a$ or $b$ is a unit. Hence $p$ is also a
prime of $R[t]$.

Let $p(t) = p$ be a primitive polynomial in $R[t]$ irreducible in $K[t]$. If
$p = fg$ with $f, g \in R[t]$, then from unique factorization in $K[t]$ we
conclude that $\deg f = \deg p$ or $\deg g = \deg p$. Say
$\deg f = \deg p$. Then $g \in R$. Since the coefficients of $p(t)$ are
relatively prime, it follows that $g$ is a unit of $R$. Hence $p(t)$ is a
prime element of $R[t]$.

Let $f(t) \in R[t]$. We can write $f = cg$ where $c$ is the g.c.d. of the
coefficients of $f$, and $g$ then has relatively prime coefficients. We know
that $c$ has unique factorization in $R$ by hypothesis. Let

    $g = q_1 \cdots q_r$

be a factorization of $g$ into irreducible polynomials $q_1, \ldots, q_r$ in
$K[t]$. Such a factorization exists since we know that $K[t]$ is factorial.
By Lemma 6.5 there exist elements $b_1, \ldots, b_r \in K$ such that if we
let $p_i = b_i q_i$ then $p_i$ has relatively prime coefficients in $R$. Let
their product be

    $u = b_1 \cdots b_r.$

Then $ug = p_1 \cdots p_r$. By the Gauss Lemma 6.6, the right-hand side is a
polynomial in $R[t]$ with relatively prime coefficients. Since $g$ is
assumed to have relatively prime coefficients in $R$, it follows that
$u \in R$ and $u$ is a unit in $R$. Then

    $f = c u^{-1} p_1 \cdots p_r$

is a factorization of $f$ in $R[t]$ into prime elements of $R[t]$ and an
element of $R$. Thus a factorization exists.

There remains to prove uniqueness (up to factors which are units, of
course). Suppose that

    $f = c p_1 \cdots p_r = d q_1 \cdots q_s,$

where $c, d \in R$, and $p_1, \ldots, p_r$, $q_1, \ldots, q_s$ are
irreducible polynomials in $K[t]$ with relatively prime coefficients. If we
read this relation in $K[t]$ and use the fact that $K[t]$ is factorial, as
well as Theorem 6.7, then we conclude that after a permutation of indices,
we have $r = s$ and there are elements $b_i \in K$, $i = 1, \ldots, r$ such
that

    $p_i = b_i q_i$ for $i = 1, \ldots, r$.

Since $p_i$, $q_i$ have relatively prime coefficients in $R$, it follows
that in fact $b_i$ is a unit in $R$ by Lemma 6.8. This proves the
uniqueness.

Theorem 6.10 (Eisenstein’s criterion). Let $R$ be a factorial ring, and let
$K$ be its quotient field. Let

    $f(t) = a_n t^n + \cdots + a_0$

be a polynomial of degree $n \geq 1$ with coefficients in $R$. Let $p$ be a
prime, and assume:

    $a_n \not\equiv 0 \pmod{p}$, $a_i \equiv 0 \pmod{p}$ for all $i < n$,

    $a_0 \not\equiv 0 \pmod{p^2}$.

Then $f$ is irreducible over $K$.

Proof. Follow step by step the proof of Theorem 5.4.

Example. Let $F$ be any field. Let $K = F(t)$ be the quotient field of
the ring of polynomials, and let $R = F[t]$. Then $R$ is a factorial ring.
Note that $t$ itself is a prime element in this ring. For any element
$c \in F$ the polynomial $t - c$ is also a prime element.

Let $X$ be a variable. Then for every positive integer $n$, the polynomial

    $f(X) = X^n - t$

is irreducible in $K[X]$. This follows from Eisenstein’s Criterion. In this
case, we let $p = t$, and:

    $a_n = 1 \not\equiv 0 \bmod t$, $a_0 = -t$, $a_i = 0$ for $0 < i < n$.

Thus the hypotheses in Eisenstein’s Criterion are satisfied.

Similarly, the polynomial

\)X 3 + (t- 1 fX 2 + t(t - \)*X - (t - 1) 

is irreducible in $K[X]$. In this case, we let $p = t - 1$.

The analogy between the ring of integers $\mathbf{Z}$ and the ring of
polynomials $F[t]$ is one of the most fruitful in mathematics.

Theorem 6.11 (Reduction criterion). Let $R$ be a factorial ring and let $K$
be its quotient field. Let $f(t) \in R[t]$ be a primitive polynomial with
leading coefficient $a_n$ which is not divisible by a prime $p$ of $R$. Let
$R \to R/pR$ be reduction mod $p$, and denote the image of $f$ by
$\bar{f}$. Let $F$ be the quotient field of $R/pR$. If $\bar{f}$ is
irreducible in $F[t]$ then $f$ is irreducible in $K[t]$.

Proof. The proof is essentially similar to the proof of Theorem 5.5, but
there is a slight added technical point due to the fact that $R/pR$ is not
necessarily a field. To deal with that point, we have to use the ring
$R_{(p)}$ of Exercise 3, so we assume that you have done that exercise. Thus
the ring $R_{(p)}$ is principal, and $R_{(p)}/pR_{(p)}$ is a field.
Furthermore, $F = R_{(p)}/pR_{(p)}$. Now exactly the same argument as in
Theorem 5.5 shows that if $\bar{f}$ is irreducible in $F[t]$, then $f$ is
irreducible in $R_{(p)}[t]$ and hence $f$ is irreducible in $K[t]$, because
$K$ is also the quotient field of $R_{(p)}$. This concludes the proof.


IV, §6. EXERCISES 

1. Let $p$ be a prime number. Let $R$ be the ring of all rational numbers
$m/n$, where $m$, $n$ are relatively prime integers, and $p \nmid n$.

(a) What are the units of $R$?

(b) Show that $R$ is a principal ring. What are the prime elements of $R$?

2. Let $F$ be a field. Let $p$ be an irreducible polynomial in the ring
$F[t]$. Let $R$ be the ring of all rational functions $f(t)/g(t)$ such that
$f$, $g$ are relatively prime polynomials in $F[t]$ and $p \nmid g$.

(a) What are the units of $R$?

(b) Show that $R$ is a principal ring. What are the prime elements of $R$?

3. If you are alert, you will already have generalized the first two
exercises to the following. Let $R$ be a factorial ring. Let $p$ be a prime
element. Let $R_{(p)}$ be the set of all quotients $a/b$ with $a, b \in R$
and $b$ not divisible by $p$. Then:

(a) The units of $R_{(p)}$ are the quotients $a/b$ with $a, b \in R$,
$p \nmid ab$.

(b) The ring $R_{(p)}$ has a unique maximal ideal, namely $pR_{(p)}$,
consisting of all elements $a/b$ such that $p \nmid b$ and $p \mid a$.

(c) The ring $R_{(p)}$ is principal, and every ideal is of the form
$p^m R_{(p)}$ for some integer $m \geq 0$.

(d) The factor ring $R_{(p)}/pR_{(p)}$ is a field, which “is” the quotient
field of $R/pR$.

If you have not already done so, then prove the above statements.

4. Let $R = \mathbf{Z}[t]$. Let $p$ be a prime number. Show that $t - p$ is
a prime element in $R$. Is $t^2 - p$ a prime element in $R$? What about
$t^3 - p$? What about $t^n - p$ where $n$ is a positive integer?

5. Let $p$ be a prime number. Show that the ideal $(p, t)$ is not principal
in the ring $\mathbf{Z}[t]$.

6. Trigonometric polynomials. Let $R$ be the ring of all functions $f$ of
the form

    $f(x) = a_0 + \sum_{k=1}^{n} (a_k \cos kx + b_k \sin kx)$

where $a_0$, $a_k$, $b_k$ are real numbers. Such a function is called a
trigonometric polynomial.

(a) Prove that $R$ is a ring.

(b) If $a_n$ or $b_n \neq 0$ define $n$ to be the (trigonometric) degree of
$f$. Prove that if $f$, $g$ are trigonometric polynomials, then

    $\deg(fg) = \deg f + \deg g.$

Deduce that $R$ has no divisors of zero, so is integral.

(c) Conclude that the functions $\sin x$, $1 + \cos x$, $1 - \cos x$ are
prime elements in the ring. As Hale Trotter observed (Math. Monthly, April
1988) the relation

    $\sin^2 x = (1 + \cos x)(1 - \cos x)$

is an example of non-unique factorization into prime elements.

7. Let $R$ be the subset of the polynomial ring $\mathbf{Q}[t]$ consisting
of all polynomials

    $a_0 + a_2 t^2 + a_3 t^3 + \cdots + a_n t^n$

(so the term of degree 1 is missing), with $a_i \in \mathbf{Q}$.

(a) Show that $R$ is a ring, and that the ideal $(t^2, t^3)$ is not principal.

(b) Show that $R$ is not factorial.

8. Let $R$ be the set of numbers of the form

    $a + b\sqrt{-5}$ with $a, b \in \mathbf{Z}$.

(a) Show that $R$ is a ring.

(b) Show that the map $a + b\sqrt{-5} \mapsto a - b\sqrt{-5}$ of $R$ into
itself is an automorphism.

(c) Show that the only units of $R$ are $\pm 1$.

(d) Show that 3, $2 + \sqrt{-5}$ and $2 - \sqrt{-5}$ are prime elements, and
give a non-unique factorization

    $3^2 = (2 + \sqrt{-5})(2 - \sqrt{-5}).$

(e) Similarly, show that 2, $1 + \sqrt{-5}$ and $1 - \sqrt{-5}$ are prime
elements, which give the non-unique factorization

    $2 \cdot 3 = (1 + \sqrt{-5})(1 - \sqrt{-5}).$
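
The two factorizations in Exercise 8 can be checked by direct computation,
and a small search suggests why the factors cannot be split further. The
sketch below is an illustration, not a proof: an element $a + b\sqrt{-5}$ is
stored as the pair `(a, b)`, the norm $a^2 + 5b^2$ is the product of the
element with its image under the automorphism in (b), and the function
names are invented for this example.

```python
def norm(a, b):
    """Norm of a + b*sqrt(-5): its product with a - b*sqrt(-5)."""
    return a * a + 5 * b * b

def times(x, y):
    """Product of two elements a + b*sqrt(-5), each stored as a pair (a, b)."""
    (a, b), (c, d) = x, y
    return (a * c - 5 * b * d, a * d + b * c)

def elements_of_norm(n):
    """All pairs (a, b) with a^2 + 5b^2 = n, found by a bounded search."""
    bound = int(n ** 0.5) + 1
    return [(a, b)
            for a in range(-bound, bound + 1)
            for b in range(-bound, bound + 1)
            if norm(a, b) == n]

nine = times((2, 1), (2, -1))   # (2 + sqrt(-5))(2 - sqrt(-5))
six = times((1, 1), (1, -1))    # (1 + sqrt(-5))(1 - sqrt(-5))
```

No element has norm 2 or 3, while the elements of norm 1 are exactly
$\pm 1$; these are the observations on which parts (c), (d), (e) can be
based.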


9. Let $d$ be a positive integer which is not a square of an integer, so in
particular, $d \geq 2$. Let $R$ be the ring of all numbers
$a + b\sqrt{-d}$ with $a, b \in \mathbf{Z}$. Let $\alpha = a + b\sqrt{-d}$
be an element of this ring, and let $\bar{\alpha}$ be its complex
conjugate.

(a) Prove that $\alpha$ is a unit in $R$ if and only if
$\alpha\bar{\alpha} = 1$.

(b) If $\alpha\bar{\alpha} = p$, where $p$ is a prime number, prove that
$\alpha$ is a prime element of $R$.

10. Let $p$ be an odd prime number, and suppose $p = \alpha\bar{\alpha}$,
with $\alpha \in \mathbf{Z}[\sqrt{-1}]$. Prove that $p \equiv 1 \bmod 4$.
(The converse is also true, but is more difficult to prove, in other words:
if $p \equiv 1 \bmod 4$ then $p = \alpha\bar{\alpha}$ with some
$\alpha \in \mathbf{Z}[\sqrt{-1}]$.)

11. Determine whether $3 - 2\sqrt{2}$ is a square in the ring
$\mathbf{Z}[\sqrt{2}]$.


IV, §7. POLYNOMIALS IN SEVERAL VARIABLES 

In this section, we study the most common example of a ring which is 
factorial but not principal. 

Let $F$ be a field. We know that $F[t]$ is a principal ring. Let
$F[t] = R$. Let $t = t_1$, and let $t_2$ be another variable. Since $F[t]$
is factorial, it follows from Theorem 6.9 that $R[t_2]$ is factorial.
Similarly,

    $F[t_1][t_2] \cdots [t_n]$

is factorial. This ring is usually denoted by

    $F[t_1, \ldots, t_n]$

and its elements are called polynomials in $n$ variables. Every element of
$F[t_1, \ldots, t_n]$ can be written as a sum

    $f(t_1, \ldots, t_n) = \sum_{i_n=0}^{d_n} \Bigl( \sum_{i_1, \ldots, i_{n-1}} a_{i_1 \cdots i_n} t_1^{i_1} \cdots t_{n-1}^{i_{n-1}} \Bigr) t_n^{i_n}$

and we thus see that $f$ can be written

    $f(t_1, \ldots, t_n) = \sum_{j=0}^{d_n} f_j(t_1, \ldots, t_{n-1}) t_n^j,$

where $f_j$ are polynomials in $n - 1$ variables.
As a matter of notation, it is useful to abbreviate

    $a_{i_1 \cdots i_n} = a_{(i)}$

and write

    $f(t_1, \ldots, t_n) = \sum_{(i)} a_{(i)} t_1^{i_1} \cdots t_n^{i_n}.$

The sum is taken separately over all the indices

    $0 \leq i_1 \leq d_1, \ldots, 0 \leq i_n \leq d_n.$

Sometimes one also abbreviates

    $t_1^{i_1} \cdots t_n^{i_n} = t^{(i)}$

and one writes the polynomial in the form

    $f(t) = \sum_{(i)} a_{(i)} t^{(i)}.$

The ring $F[t_1, \ldots, t_n]$ is definitely not a principal ring for
$n \geq 2$. For instance, let $n = 2$. The ideal

    $(t_1, t_2)$

generated by $t_1$ and $t_2$ in $F[t_1, t_2]$ is not principal. (Prove this
as an exercise.) This ideal $(t_1, t_2)$ is a maximal ideal, whose residue
class field is $F$ itself. You can prove this statement as an exercise.
Similarly, the ideal $(t_1, \ldots, t_n)$ is maximal in
$F[t_1, \ldots, t_n]$ and its residue class field is $F$.

Let $f$ be a polynomial in $n$ variables, and write

    $f(t_1, \ldots, t_n) = \sum c_{(i)} t_1^{i_1} \cdots t_n^{i_n}$, where $c_{(i)} = c_{i_1 \cdots i_n}$.

We call each term

    $c_{(i)} t_1^{i_1} \cdots t_n^{i_n}$

a monomial, and if $c_{(i)} \neq 0$ we define the degree of this monomial to
be the sum of the exponents, that is

    $\deg(t_1^{i_1} \cdots t_n^{i_n}) = i_1 + \cdots + i_n.$

A polynomial $f$ as above is said to be homogeneous of degree $d$ if all the
terms with $c_{(i)} \neq 0$ have the property that

    $i_1 + \cdots + i_n = d.$
Example. The monomial 5t\t 2 t* has degree 8. The polynomial 
is not homogeneous. The polynomial 


is homogeneous of degree 8. 

Given a polynomial $f(t_1, \ldots, t_n)$ in $n$ variables, we can write $f$
as a sum

    $f = f_0 + f_1 + \cdots + f_d$

where $f_k$ is homogeneous of degree $k$ or is 0. By convention, we agree
that the zero polynomial has degree $-\infty$, because some of the terms
$f_k$ may be 0. To write $f$ in the above fashion, all we have to do is
collect together all the monomials of the same degree, that is

    $f_k = \sum_{i_1 + \cdots + i_n = k} c_{(i)} t_1^{i_1} \cdots t_n^{i_n}.$

If $f_d \neq 0$ is the term of highest homogeneous degree in the sum for $f$
above, then we say that $f$ has total degree $d$, or simply degree $d$. Just
as for polynomials in one variable, we have the property:

Theorem 7.1. Let $f$, $g$ be polynomials in $n$ variables, and $f \neq 0$,
$g \neq 0$. Then

    $\deg(fg) = \deg f + \deg g.$

Proof. Write $f = f_0 + \cdots + f_d$ and $g = g_0 + \cdots + g_e$ with
$f_d \neq 0$ and $g_e \neq 0$. Then

    $fg = f_0 g_0 + \cdots + f_d g_e$

and since the polynomial ring is integral, $f_d g_e \neq 0$. But $f_d g_e$
is the homogeneous part of $fg$ of highest degree, so $\deg(fg) = d + e$, as
was to be shown.
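
The decomposition into homogeneous parts and the degree formula can be tried
out on examples. In the sketch below, which is an illustration and not part
of the text, a polynomial is a dictionary sending each exponent tuple
$(i_1, \ldots, i_n)$ to its non-zero integer coefficient; this
representation, the function names, and the two sample polynomials are
assumptions made for the example.

```python
from collections import defaultdict

def multiply(f, g):
    """Product of two polynomials stored as {exponent tuple: coefficient}."""
    out = defaultdict(int)
    for e1, c1 in f.items():
        for e2, c2 in g.items():
            e = tuple(i + j for i, j in zip(e1, e2))   # exponents add
            out[e] += c1 * c2
    return {e: c for e, c in out.items() if c != 0}

def total_degree(f):
    """Total degree of a non-zero polynomial."""
    return max(sum(e) for e in f)

def homogeneous_part(f, k):
    """The part f_k made of all monomials of degree k."""
    return {e: c for e, c in f.items() if sum(e) == k}

f = {(2, 1): 1, (1, 0): 1}                # t1^2 t2 + t1
g = {(1, 1): 1, (0, 3): -1, (0, 0): 1}    # t1 t2 - t2^3 + 1
fg = multiply(f, g)
```

In this example both factors have degree 3, the product has degree 6, and
the top homogeneous part of `fg` is the product of the top parts of `f` and
`g`, as in the proof.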

Remark. Do not confuse the degree (i.e. the total degree) and the
degree of $f$ in each variable. If $f$ can be written in the form

    $f(t_1, \ldots, t_n) = \sum_{k=0}^{d} f_k(t_1, \ldots, t_{n-1}) t_n^k$

and $f_d \neq 0$ as a polynomial in $t_1, \ldots, t_{n-1}$ then we say that
$f$ has degree $d$ in $t_n$. For instance, the polynomial

nt\t 2 tl + ^3 + *3 

has total degree 9, it has degree 3 in $t_1$, degree 1 in $t_2$ and degree 5 in $t_3$.

From Theorem 7.1, or from the fact that the ring of polynomials in
several variables is

    $F[t_1, \ldots, t_{n-1}][t_n],$

we conclude that this ring is integral. Thus we may form its quotient
field, just as in the case of one variable. We denote this quotient field by

    $F(t_1, \ldots, t_n)$

and call it the field of rational functions (in several variables).

The above results go as far as we wish in our study of polynomials. 
We end this section with some comments on polynomial functions. 

Let $K^n = K \times \cdots \times K$. Just as in the case of one variable, a
polynomial $f(t) \in K[t_1, \ldots, t_n]$ may be viewed as a function

    $f_{K^n}: K^n \to K.$

Indeed, let $x = (x_1, \ldots, x_n)$ be an $n$-tuple in $K^n$. If

    $f(t_1, \ldots, t_n) = f(t) = \sum a_{(i)} t_1^{i_1} \cdots t_n^{i_n}$

then we define

    $f(x_1, \ldots, x_n) = f(x) = \sum a_{(i)} x_1^{i_1} \cdots x_n^{i_n}.$

The map $x \mapsto f(x)$ is a function of $K^n$ into $K$. But also the map

    $f \mapsto f(x)$

is a homomorphism of $K[t_1, \ldots, t_n]$ into $K$, called the evaluation
at $x$.

More generally, let $K$ be a subring of a commutative ring $A$. Let
$f(t) \in K[t_1, \ldots, t_n]$ be a polynomial. If

    $x = (x_1, \ldots, x_n) \in A^n$

is an $n$-tuple in $A^n$ then again we may define $f(x)$ by the same
expression, and we obtain a function

    $f_{A^n}: A^n \to A$ by $x \mapsto f(x)$.

Just as in the case of one variable, different polynomials can give rise to
the same function. We gave an example of this phenomenon in §1, in
the context of a finite field.

Theorem 7.2. Let $K$ be an infinite field. Let
$f \in K[t_1, \ldots, t_n]$ be a polynomial in $n$ variables. If the
corresponding function

    $x \mapsto f(x)$ of $K^n \to K$

is the zero function, then $f$ is the zero polynomial. Furthermore, if $f$,
$g$ are two polynomials giving the same function of $K^n$ into $K$, then
$f = g$.

Proof. The second assertion is a consequence of the first. Namely,
given $f$, $g$ which give the same function of $K^n$ into $K$, let
$h = f - g$. Then $h$ gives the zero function, so $h = 0$ and $f = g$. We
now prove the first by induction. Write

    $f(t_1, \ldots, t_n) = \sum_{j=0}^{d} f_j(t_1, \ldots, t_{n-1}) t_n^j.$

Thus we write $f$ as a polynomial in $t_n$, with coefficients which are
polynomials in $t_1, \ldots, t_{n-1}$. Let $(c_1, \ldots, c_{n-1})$ be
arbitrary elements of $K$. By assumption, the polynomial

    $f(c_1, \ldots, c_{n-1}, t) = \sum_{j=0}^{d} f_j(c_1, \ldots, c_{n-1}) t^j$

vanishes when we substitute any element $c$ of $K$ for $t$. In other words,
this polynomial in one variable $t$ has infinitely many roots in $K$.
Therefore this polynomial is identically zero. This means that

    $f_j(c_1, \ldots, c_{n-1}) = 0$ for all $j = 0, \ldots, d$

and all $(n-1)$-tuples $(c_1, \ldots, c_{n-1})$ in $K^{n-1}$. By induction,
it follows that for each $j$ the polynomial $f_j(t_1, \ldots, t_{n-1})$ is
the zero polynomial, whence finally $f = 0$. This concludes the proof.

Let $R$ be a subring of a commutative ring $S$. Let $x_1, \ldots, x_n$ be
elements of $S$. We denote by

    $R[x_1, \ldots, x_n]$

the ring consisting of all elements $f(x_1, \ldots, x_n)$ where $f$ ranges
over all polynomials in $n$ variables with coefficients in $R$. We say that

    $R[x_1, \ldots, x_n]$

is the ring generated by $x_1, \ldots, x_n$ over $R$.

Example. Let $\mathbf{R}$ be the real numbers, and let $\varphi$, $\psi$ be
the two functions

    $\varphi(x) = \sin x$ and $\psi(x) = \cos x$.

Then

    $\mathbf{R}[\varphi, \psi]$

is the subring of all functions (even the subring of all differentiable
functions) generated by $\varphi$ and $\psi$. In fact,
$\mathbf{R}[\varphi, \psi]$ is the ring of trigonometric polynomials as in
Exercise 6 of §6.

Let $K$ be a subfield of a field $E$. Let $x_1, \ldots, x_n$ be elements of
$E$. As we have seen, we get an evaluation homomorphism

    $K[t_1, \ldots, t_n] \to E$ by $f(t_1, \ldots, t_n) \mapsto f(x_1, \ldots, x_n) = f(x)$.

If the kernel of this homomorphism is 0, that is if the evaluation map is
injective, then we say that $x_1, \ldots, x_n$ are algebraically
independent, or that they are independent variables over $K$. The
polynomial ring $K[t_1, \ldots, t_n]$ is then isomorphic to the ring
$K[x_1, \ldots, x_n]$ generated by $x_1, \ldots, x_n$ over $K$.


Example. It can be shown that if $K = \mathbf{Q}$, there always are
infinitely many algebraically independent $n$-tuples $(x_1, \ldots, x_n)$ in
the complex, or in the real numbers.

In practice, it is quite hard to determine whether two given numbers
are or are not algebraically independent over $\mathbf{Q}$. For instance,
let $e$ be the natural base of logarithms. It is not known if $e$ and $\pi$
are algebraically independent. It is not even known if $e/\pi$ is
irrational.


IV, §7. EXERCISES 

1. Let $F$ be a field. Show that the ideal $(t_1, t_2)$ is not principal in
the ring $F[t_1, t_2]$. Similarly, show that $(t_1, \ldots, t_n)$ is not
principal in the ring $F[t_1, \ldots, t_n]$.

2. Show that the polynomial $t_1 - t_2$ is irreducible in the ring
$F[t_1, t_2]$.
3. In connection with Theorem 7.2 one can ask to what extent it is necessary
to have an infinite field. Or even over the real numbers, let $S$ be a
subset of $\mathbf{R}^n$. Let $f(t_1, \ldots, t_n)$ be a polynomial in $n$
variables. Suppose that $f(x) = 0$ for all $x \in S$. Can we assert that
$f = 0$? Not always, not even if $S$ is infinite and $n \geq 2$. Prove the
following statement which gives a first result in this direction.

Let $K$ be a field, and let $S_1, \ldots, S_n$ be finite subsets of $K$. Let
$f \in K[t_1, \ldots, t_n]$ and suppose that the degree of $f$ in the
variable $t_i$ is $\leq d_i$. Assume that $\#(S_i) > d_i$ for
$i = 1, \ldots, n$. Also suppose that

    $f(x_1, \ldots, x_n) = 0$ for all $x_i \in S_i$,

that is $f(x) = 0$ for all $x \in S_1 \times \cdots \times S_n$. Then
$f = 0$.

The proof will follow the same pattern as the proof of Theorem 7.2.

Application to finite fields

4. Prove the following analogue of Theorem 7.2 for finite fields.

Let $K$ be a finite field with $q$ elements. Let
$f(t) \in K[t_1, \ldots, t_n]$ be a polynomial in $n$ variables, such that
the degree of $f$ in each variable $t_i$ is $< q$. Assume that

    $f(a_1, \ldots, a_n) = 0$ for all $a_1, \ldots, a_n \in K$.

Then $f$ is the zero polynomial.

Let $K$ be a finite field and let $f(t) \in K[t_1, \ldots, t_n]$ be a
polynomial. If $t_i^q$ occurs in some monomial in $f$, replace $t_i^q$ by
$t_i$. After doing this a finite number of times, you obtain a polynomial
$g(t_1, \ldots, t_n)$ such that the degree of $g$ in each variable $t_i$ is
$< q$. Since $x^q = x$ for all $x \in K$, it follows that $f$, $g$ induce
the same function of $K^n$ into $K$. By a reduced polynomial associated
with $f$ we mean a polynomial $g$ such that the degree of $g$ in each
variable is $< q$ and such that the functions induced by $f$ and $g$ on
$K^n$ are equal. The existence of a reduced polynomial was proved above.
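
The reduction procedure just described can be sketched for a prime field.
The code below is an illustration under explicit assumptions: $K$ is taken
to be $\mathbf{Z}/p\mathbf{Z}$ (so $q = p$), a polynomial is a dictionary
from exponent tuples to coefficients, and the function names and the sample
polynomial are invented for the example.

```python
from itertools import product

def reduce_exponent(e, q):
    """Replace t^q by t until the exponent is < q; each step lowers it by q - 1."""
    while e >= q:
        e -= q - 1
    return e

def reduced(f, p):
    """A reduced polynomial associated with f over Z/pZ."""
    g = {}
    for exps, c in f.items():
        key = tuple(reduce_exponent(e, p) for e in exps)
        g[key] = (g.get(key, 0) + c) % p      # monomials may merge after reduction
    return {key: c for key, c in g.items() if c != 0}

def evaluate(f, x, p):
    """Value of f at the point x of (Z/pZ)^n."""
    total = 0
    for exps, c in f.items():
        term = c
        for xi, e in zip(x, exps):
            term *= pow(xi, e, p)
        total += term
    return total % p

f = {(5, 3): 1, (1, 1): 2, (0, 4): 1}   # t1^5 t2^3 + 2 t1 t2 + t2^4 over Z/3Z
g = reduced(f, 3)
same = all(evaluate(f, x, 3) == evaluate(g, x, 3)
           for x in product(range(3), repeat=2))
```

In this example two monomials of `f` reduce to $t_1 t_2$ and cancel mod 3,
leaving the single monomial $t_2^2$.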

5. Given a polynomial $f \in K[t_1, \ldots, t_n]$ prove that a reduced
polynomial associated with $f$ is unique, i.e. there is only one.

6. Chevalley’s theorem. Let $K$ be a finite field with $q$ elements. Let

    $f(t_1, \ldots, t_n) \in K[t_1, \ldots, t_n]$

be a polynomial in $n$ variables of total degree $d$. Suppose that the
constant term of $f$ is 0, that is $f(0, \ldots, 0) = 0$. Assume that
$n > d$. Then there exist elements $a_1, \ldots, a_n \in K$ not all 0 such
that $f(a_1, \ldots, a_n) = 0$. [Hint: Think of the exercises of §1.
Compare the polynomials

    $1 - f^{q-1}$ and $(1 - t_1^{q-1}) \cdots (1 - t_n^{q-1})$

and their degrees.]
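
The statement of Exercise 6 can be tested on small cases by listing zeros.
The sketch below is an illustration only: it assumes
$K = \mathbf{Z}/p\mathbf{Z}$, represents the polynomial by an ordinary
Python function on integers, and uses the sample forms
$t_1^2 + t_2^2 + t_3^2$ and $t_1^2 + t_2^2$ over $\mathbf{Z}/3\mathbf{Z}$,
none of which come from the text.

```python
from itertools import product

def nontrivial_zeros(f, n, p):
    """All points of (Z/pZ)^n other than the origin at which f vanishes mod p."""
    return [x for x in product(range(p), repeat=n)
            if any(x) and f(*x) % p == 0]

# n = 3 > d = 2: the theorem predicts a zero other than the origin.
three_vars = nontrivial_zeros(lambda a, b, c: a * a + b * b + c * c, 3, 3)

# n = 2 = d: the hypothesis n > d fails, and here only the origin is a zero.
two_vars = nontrivial_zeros(lambda a, b: a * a + b * b, 2, 3)
```

The second computation shows that the assumption $n > d$ cannot simply be
dropped.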
IV, §8. SYMMETRIC POLYNOMIALS

Let $R$ be an integral ring and let $t_1, \ldots, t_n$ be algebraically
independent elements over $R$. Let $X$ be a variable over
$R[t_1, \ldots, t_n]$. We form the polynomial

    $P(X) = (X - t_1) \cdots (X - t_n)$

    $= X^n - s_1 X^{n-1} + \cdots + (-1)^n s_n$

where each $s_i = s_i(t_1, \ldots, t_n)$ is a polynomial in
$t_1, \ldots, t_n$. Then, for instance,

    $s_1 = t_1 + \cdots + t_n$ and $s_n = t_1 \cdots t_n$.

The polynomials $s_1, \ldots, s_n$ are called the elementary symmetric
polynomials of $t_1, \ldots, t_n$.

We leave it as an easy exercise to verify that $s_i$ is homogeneous of
degree $i$ in $t_1, \ldots, t_n$.
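
When numerical values are substituted for $t_1, \ldots, t_n$, the values of
$s_1, \ldots, s_n$ can be read off by expanding $P(X)$ one factor at a time.
The sketch below is an illustration only; the function name and the sample
values 1, 2, 3 are assumptions made for the example.

```python
def elementary_symmetric(ts):
    """Return [s_1, ..., s_n] evaluated at the numbers ts = [t_1, ..., t_n]."""
    # coeffs[k] is the coefficient of X^(m-k) in (X - t_1)...(X - t_m).
    coeffs = [1]
    for t in ts:
        # Multiplying by (X - t): new[k] = old[k] - t * old[k-1].
        coeffs = [a - t * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    # The coefficient of X^(n-i) equals (-1)^i s_i.
    return [(-1) ** i * c for i, c in enumerate(coeffs)][1:]

s = elementary_symmetric([1, 2, 3])   # (X-1)(X-2)(X-3) = X^3 - 6X^2 + 11X - 6
```

Reordering the input, for instance `elementary_symmetric([3, 1, 2])`, gives
the same list, which reflects the symmetry of the $s_i$.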

Let $\sigma$ be a permutation of the integers $(1, \ldots, n)$. Given a polynomial
$f(t) \in R[t] = R[t_1, \ldots, t_n]$, we define $\sigma f$ to be

$$\sigma f(t_1, \ldots, t_n) = f(t_{\sigma(1)}, \ldots, t_{\sigma(n)}).$$

If $\sigma$, $\tau$ are two permutations, then $(\sigma\tau)f = \sigma(\tau f)$, and hence the symmetric
group $G$ on $n$ letters operates on the polynomial ring $R[t]$. A polynomial
is called symmetric if $\sigma f = f$ for all $\sigma \in G$. It is clear that the set of
symmetric polynomials is a subring of $R[t]$, which contains the constant
polynomials (i.e. $R$ itself) and also contains the elementary symmetric
polynomials $s_1, \ldots, s_n$. We shall see below that these are generators.

Let $X_1, \ldots, X_n$ be variables. We define the weight of a monomial

$$X_1^{k_1} \cdots X_n^{k_n}$$

to be $k_1 + 2k_2 + \cdots + nk_n$. We define the weight of a polynomial
$g(X_1, \ldots, X_n)$ to be the maximum of the weights of the monomials
occurring in $g$.

Theorem 8.1. Let $f(t) \in R[t_1, \ldots, t_n]$ be symmetric of degree $d$. Then
there exists a polynomial $g(X_1, \ldots, X_n)$ of weight $\leq d$ such that

$$f(t) = g(s_1, \ldots, s_n).$$

Proof. By induction on $n$. The theorem is obvious if $n = 1$, because
$s_1 = t_1$.

Assume the theorem proved for polynomials in $n - 1$ variables.
If we substitute $t_n = 0$ in the expression for $P(X)$, we find

$$(X - t_1) \cdots (X - t_{n-1})X = X^n - (s_1)_0 X^{n-1} + \cdots + (-1)^{n-1}(s_{n-1})_0 X,$$

where $(s_i)_0$ is the expression obtained by substituting $t_n = 0$ in $s_i$. We see
that $(s_1)_0, \ldots, (s_{n-1})_0$ are precisely the elementary symmetric polynomials
in $t_1, \ldots, t_{n-1}$.

We now carry out induction on $d$. If $d = 0$, our assertion is trivial.
Assume $d > 0$, and assume our assertion proved for polynomials of
degree $< d$. Let $f(t_1, \ldots, t_n)$ have degree $d$. There exists a polynomial
$g_1(X_1, \ldots, X_{n-1})$ of weight $\leq d$ such that

$$f(t_1, \ldots, t_{n-1}, 0) = g_1((s_1)_0, \ldots, (s_{n-1})_0).$$

We note that $g_1(s_1, \ldots, s_{n-1})$ has degree $\leq d$ in $(t_1, \ldots, t_n)$. The polynomial

$$f_1(t_1, \ldots, t_n) = f(t_1, \ldots, t_n) - g_1(s_1, \ldots, s_{n-1})$$

has degree $\leq d$ (in $t_1, \ldots, t_n$) and is symmetric. We have

$$f_1(t_1, \ldots, t_{n-1}, 0) = 0.$$

Hence $f_1$ is divisible by $t_n$, i.e. contains $t_n$ as a factor. Since $f_1$ is
symmetric, it contains $t_1 \cdots t_n$ as a factor. Hence

$$f_1 = s_n f_2(t_1, \ldots, t_n)$$

for some polynomial $f_2$, which must be symmetric, and whose degree is
$\leq d - n < d$. By induction, there exists a polynomial $g_2$ in $n$ variables
and weight $\leq d - n$ such that

$$f_2(t_1, \ldots, t_n) = g_2(s_1, \ldots, s_n).$$

We obtain

$$f(t) = g_1(s_1, \ldots, s_{n-1}) + s_n g_2(s_1, \ldots, s_n),$$

and each term on the right has weight $\leq d$. This proves our theorem.
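
To see Theorem 8.1 at work on familiar cases, the following Python sketch checks numerically two classical expressions of power sums in terms of the elementary symmetric polynomials, and computes the weights of the monomials involved. The sample values and the helper names `elementary_symmetric` and `weight` are assumptions made for illustration; a numerical check at one point is of course not a proof.

```python
from itertools import combinations
from math import prod

def elementary_symmetric(ts):
    """Return [s_1, ..., s_n] evaluated at the numbers ts."""
    n = len(ts)
    return [sum(prod(c) for c in combinations(ts, i)) for i in range(1, n + 1)]

def weight(monomial):
    """Weight k_1 + 2 k_2 + ... + n k_n of X_1^{k_1} ... X_n^{k_n}."""
    return sum(i * k for i, k in enumerate(monomial, start=1))

ts = [2, -3, 5, 7]
s1, s2, s3, s4 = elementary_symmetric(ts)

# f = t_1^2 + ... + t_n^2 has degree 2; candidate g = X_1^2 - 2 X_2.
p2 = sum(t**2 for t in ts)
# f = t_1^3 + ... + t_n^3 has degree 3; candidate g = X_1^3 - 3 X_1 X_2 + 3 X_3.
p3 = sum(t**3 for t in ts)

print(p2 == s1**2 - 2 * s2)
print(p3 == s1**3 - 3 * s1 * s2 + 3 * s3)
# Exponent tuples (k_1, k_2, k_3, k_4) of the monomials of each g and their weights.
print([weight(m) for m in [(2, 0, 0, 0), (0, 1, 0, 0)]])
print([weight(m) for m in [(3, 0, 0, 0), (1, 1, 0, 0), (0, 0, 1, 0)]])
```

In both cases every monomial of $g$ has weight equal to the degree of $f$, in agreement with the bound on the weight in the theorem.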

Theorem 8.2. The elementary symmetric polynomials $s_1, \ldots, s_n$ are
algebraically independent over $R$.

Proof. If they are not, take a polynomial $f(X_1, \ldots, X_n) \in R[X_1, \ldots, X_n]$
of least degree and not equal to 0 such that

$$f(s_1, \ldots, s_n) = 0.$$

Write $f$ as a polynomial in $X_n$ with coefficients in $R[X_1, \ldots, X_{n-1}]$,

$$f(X_1, \ldots, X_n) = f_0(X_1, \ldots, X_{n-1}) + \cdots + f_d(X_1, \ldots, X_{n-1})X_n^d.$$

Then $f_0 \neq 0$. Otherwise, we can write

$$f(X) = X_n h(X)$$

with some polynomial $h$, and hence $s_n h(s_1, \ldots, s_n) = 0$. From this it
follows that $h(s_1, \ldots, s_n) = 0$, and $h$ has degree smaller than the degree of $f$.
We substitute $s_i$ for $X_i$ in the above relation, and get

$$0 = f_0(s_1, \ldots, s_{n-1}) + \cdots + f_d(s_1, \ldots, s_{n-1})s_n^d.$$

This is a relation in $R[t_1, \ldots, t_n]$, and we substitute 0 for $t_n$ in this
relation. Then all terms become 0 except the first one, which gives

$$0 = f_0((s_1)_0, \ldots, (s_{n-1})_0),$$

using the same notation as in the proof of Theorem 8.1. This is a
non-trivial relation between the elementary symmetric polynomials in
$t_1, \ldots, t_{n-1}$, a contradiction.

Example. Consider the product

$$\Delta(t_1, \ldots, t_n) = \Delta(t) = \prod_{i<j} (t_i - t_j).$$

For any permutation $\sigma$ of $(1, \ldots, n)$, as in Chapter II, Theorem 6.4, we see
that

$$\sigma\Delta(t) = \pm\Delta(t).$$

Hence $\Delta(t)^2$ is symmetric, and we call it the discriminant:

$$D_f(s_1, \ldots, s_n) = D(s_1, \ldots, s_n) = \prod_{i<j} (t_i - t_j)^2 = \Delta(t)^2.$$

We thus view the discriminant as a polynomial in the elementary
symmetric functions.

Let $F$ be a field, and let

$$P(X) = (X - \alpha_1) \cdots (X - \alpha_n) = X^n - c_1 X^{n-1} + \cdots + (-1)^n c_n$$

be a polynomial in $F[X]$ with roots $\alpha_1, \ldots, \alpha_n \in F$. Then there is a unique
homomorphism

$$\mathbf{Z}[t_1, \ldots, t_n] \to F \quad \text{mapping} \quad t_i \mapsto \alpha_i.$$

This homomorphism is the composite

$$\mathbf{Z}[t_1, \ldots, t_n] \to F[t_1, \ldots, t_n] \to F,$$

where the first map is induced by the unique homomorphism $\mathbf{Z} \to F$, and
the second map is the evaluation homomorphism sending $t_i \mapsto \alpha_i$.

Under this homomorphism, we see that

$$\Delta(t_1, \ldots, t_n) \mapsto \Delta(\alpha_1, \ldots, \alpha_n)$$

and so

$$D(s_1, \ldots, s_n) \mapsto D(c_1, \ldots, c_n).$$

Therefore, to get a formula for the discriminant of a polynomial, it
suffices to find the formula for a polynomial over the integers $\mathbf{Z}$, with
algebraically independent roots $t_1, \ldots, t_n$. We shall now find such a
formula for polynomials of degree 2 and 3.

Example (Quadratic polynomials). Let

$$f(X) = X^2 + bX + c = (X - t_1)(X - t_2).$$

Then by direct computation, you will find that

$$D_f = b^2 - 4c.$$


Example (Cubic polynomials). Consider a cubic polynomial

$$f(X) = X^3 - s_1 X^2 + s_2 X - s_3 = (X - t_1)(X - t_2)(X - t_3).$$

We want to find a formula for the discriminant, which is more
complicated than in the quadratic case. There is such a formula, but in
practice, it is best to make a change of variables first and reduce the
problem to the case of a cubic whose $X^2$ term is missing. Namely, let

$$Y = X - \tfrac{1}{3}s_1 \quad \text{so} \quad X = Y + \tfrac{1}{3}s_1 = Y + \tfrac{1}{3}(t_1 + t_2 + t_3).$$

Then the polynomial $f(X)$ becomes

$$f(X) = f^*(Y) = Y^3 + aY + b = (Y - u_1)(Y - u_2)(Y - u_3)$$

where $a = u_1u_2 + u_2u_3 + u_1u_3$ and $b = -u_1u_2u_3$, while $u_1 + u_2 + u_3 = 0$.
We have

$$u_i = t_i - \tfrac{1}{3}s_1 \quad \text{for } i = 1, 2, 3.$$

Note that the discriminant is unchanged because the translation by $s_1/3$
cancels out. Indeed,

$$u_i - u_j = t_i - t_j \quad \text{for all } i \neq j.$$

If we can get a formula for the discriminant of the cubic whose square
term is missing, as a function of $a$ and $b$, then we can get a formula for
the discriminant of the general cubic simply by substituting the values of
$a$ and $b$ as functions of $s_1$, $s_2$, $s_3$. You will work this out as Exercise 1.
We now consider the cubic whose square term is missing. So we let

$$f(X) = X^3 + aX + b = (X - u_1)(X - u_2)(X - u_3)$$

where $u_1$, $u_2$ are independent variables, and $u_3 = -(u_1 + u_2)$. Then the
discriminant is

$$D = (u_1 - u_2)^2 (u_1 - u_3)^2 (u_2 - u_3)^2.$$

As a function of the elementary symmetric functions $a$, $b$ the discriminant
is

$$D = -4a^3 - 27b^2.$$

As an exercise, try to determine this by brute force. We shall now give a
proof which eliminates the brute force.

Observe first that $D$ is homogeneous of degree 6 in $u_1$, $u_2$. Furthermore,
$a$ is homogeneous of degree 2 and $b$ is homogeneous of degree 3.
By Theorem 8.1, we know that there exists some polynomial

$$g(X_2, X_3) \quad \text{of weight 6}$$

such that $D = g(a, b)$. The only monomials $X_2^m X_3^n$ of weight 6, i.e. such
that

$$2m + 3n = 6 \quad \text{with integers } m, n \geq 0,$$

are those for which $m = 3$, $n = 0$ or $m = 0$, $n = 2$. Hence

$$g(X_2, X_3) = vX_2^3 + wX_3^2$$

where $v$, $w$ are integers which must now be determined.

Observe that the integers $v$, $w$ are universal, in the sense that for any
special polynomial with special values of $a$, $b$, its discriminant will be
given by

$$g(a, b) = va^3 + wb^2.$$

Consider the polynomial

$$f_1(X) = X(X - 1)(X + 1) = X^3 - X.$$

Then $a = -1$, $b = 0$ and $D_{f_1} = va^3 = -v$. But also $D_{f_1} = 4$ by using
the definition of the discriminant as the product of the differences of the
roots, squared. Hence we get

$$v = -4.$$

Next consider the polynomial

$$f_2(X) = X^3 - 1.$$

Then $a = 0$, $b = -1$, and $D_{f_2} = wb^2 = w$. But the three roots of $f_2$ are
the cube roots of unity, namely

$$1, \quad \frac{-1 + \sqrt{-3}}{2}, \quad \frac{-1 - \sqrt{-3}}{2}.$$

Using the definition of the discriminant as the product of the differences
of the roots, squared, we find the value $D_{f_2} = -27$. Hence we get

$$w = -27.$$

This concludes the proof of the formula for the discriminant of the cubic.
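
The formula $D = -4a^3 - 27b^2$ can also be spot-checked by machine. The following Python sketch takes integer roots $u_1$, $u_2$ and $u_3 = -(u_1 + u_2)$, computes $a$, $b$ and the discriminant directly from the roots, and compares. The function name `cubic_data` and the range of test values are choices made for this illustration only.

```python
from itertools import product

def cubic_data(u1, u2):
    """Roots u1, u2, u3 = -(u1 + u2) of X^3 + aX + b; return (a, b, D)."""
    u3 = -(u1 + u2)
    a = u1 * u2 + u2 * u3 + u1 * u3
    b = -u1 * u2 * u3
    # Discriminant from its definition: product of squared root differences.
    D = ((u1 - u2) * (u1 - u3) * (u2 - u3)) ** 2
    return a, b, D

ok = all(
    D == -4 * a**3 - 27 * b**2
    for u1, u2 in product(range(-6, 7), repeat=2)
    for a, b, D in [cubic_data(u1, u2)]
)
print(ok)
print(cubic_data(0, 1))   # roots 0, 1, -1 of X^3 - X
```

The second line printed corresponds to the polynomial $f_1(X) = X^3 - X$ used in the proof to determine $v$.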


IV, §8. EXERCISES 

1. Let $f(X) = X^3 + a_1X^2 + a_2X + a_3$. Show that the discriminant of $f$ is

$$a_1^2a_2^2 - 4a_2^3 - 4a_1^3a_3 - 27a_3^2 + 18a_1a_2a_3.$$

[Reduce the question to the case of a polynomial $Y^3 + aY + b$, and use the
formula for this special case.]

2. Try to work out the formula for the discriminant of X 3 + aX + b by brute 
force. 

3. Show that the discriminant of a polynomial is 0 if and only if the polynomial 
has a root of multiplicity > 1. (You may assume that the polynomial has 
coefficients in an algebraically closed field.) 

4. Let $f(X) = (X - \alpha_1) \cdots (X - \alpha_n)$. Show that

$$D_f = (-1)^{n(n-1)/2} \prod_{j=1}^{n} f'(\alpha_j).$$

IV, §9. THE MASON-STOTHERS THEOREM

Bibliographical references will occur at the end of §10. 

The first part of this section presents a theorem for polynomials dis- 
covered in 1981 by Stothers [Sto 81]. He used fundamental tools from alge- 
braic geometry, which give a lot of insight not only in the particular theorem 
but into the possibilities of extending it to more general situations. How- 
ever, partly because of the depth of the method, few people were aware of 
Stothers’ result at the time. An elementary proof was discovered by Mason 
in 1983 [Mas 83]. Then Noah Snyder gave the simplest known proof 
[Sny 00], and we present his proof here. 

We first work over an algebraically closed field of characteristic 0, the 
complex numbers if you wish. 

Let $f(t)$ be a non-zero polynomial, with its factorization

$$f(t) = c\prod_{i=1}^{r} (t - \alpha_i)^{m_i} = c(t - \alpha_1)^{m_1} \cdots (t - \alpha_r)^{m_r}, \tag{1}$$

with a non-zero constant $c$, and the distinct roots $\alpha_i$ ($i = 1, \ldots, r$). As before,
we call $m_i$ the multiplicity of $\alpha_i$. Let $\alpha$ be a constant. If $\alpha$ is not a root, that is
$\alpha \neq \alpha_i$ for all $i$, then $f(\alpha) \neq 0$. Suppose $\alpha$ is a root. It is convenient to write
the factorization of $f(t)$ in the form

$$f(t) = (t - \alpha)^{m(\alpha)} g(t)$$

where $g(\alpha) \neq 0$ and $m(\alpha)$ is the multiplicity of $\alpha$. If $\alpha = \alpha_k$ for some index $k$,
then

$$f(t) = c(t - \alpha_k)^{m_k} \prod_{i \neq k} (t - \alpha_i)^{m_i},$$

and $g(t) = c\prod_{i \neq k} (t - \alpha_i)^{m_i}$.

We defined $\alpha$ to be a multiple root of $f$ if and only if $m(\alpha) \geq 2$. If
$m(\alpha) = 1$, we say that $\alpha$ is a simple root of $f$.

Directly from the definitions, if $f(t) = (t - \alpha)^{m(\alpha)}$ with the multiplicity
$m(\alpha) \geq 1$, then the multiplicity of $\alpha$ in $f'(t)$ is $m(\alpha) - 1$. We generalize this
statement to an arbitrary polynomial.

Lemma 9.1. Let $f(t)$ be a polynomial over an algebraically closed field of
characteristic 0. Let $\alpha$ be a root of $f$ with multiplicity $m(\alpha)$. Then the
multiplicity of $\alpha$ in $f'(t)$ is $m(\alpha) - 1$.

Proof. Write $f(t) = (t - \alpha)^m g(t)$ with $g(\alpha) \neq 0$. By the rule for the
derivative of a product, we get

$$f'(t) = (t - \alpha)^m g'(t) + m(t - \alpha)^{m-1} g(t) = (t - \alpha)^{m-1}\bigl((t - \alpha)g'(t) + mg(t)\bigr) = (t - \alpha)^{m-1} h(t)$$

where $h(t) = (t - \alpha)g'(t) + mg(t)$. Note that $h(\alpha) = mg(\alpha) \neq 0$, because
$g(\alpha) \neq 0$ and $m \neq 0$ in characteristic 0.
Hence $(t - \alpha)^{m-1}$ is the highest power of $(t - \alpha)$ dividing $f'(t)$, so $m - 1$
is the multiplicity of $\alpha$ in $f'(t)$, as was to be shown.

Define $n_0(f)$ = number of distinct roots of $f$, so $n_0(f) = r$ in the
factorization (1).

Proposition 9.2. Let $f$ be a non-constant polynomial. Suppose that $f(t)$ has
the factorization (1). Then the g.c.d.$(f, f')$ is

$$(f, f') = c_1\prod_{i=1}^{r} (t - \alpha_i)^{m_i - 1}$$

with some constant $c_1$. In particular,

$$\deg(f, f') = \deg f - n_0(f).$$

Proof. The only prime polynomials which occur in the factorization of
the greatest common divisor of $f$, $f'$ must be among the prime polynomials
$t - \alpha_i$. Lemma 9.1 gives us the multiplicities, so the factorization is as
stated. The degree comes from subtracting 1 for each $i$, so the formula for
$\deg(f, f')$ falls out.
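
The degree formula for the g.c.d. of a polynomial and its derivative lends itself to a quick computation. The following Python sketch works in $\mathbf{Q}[t]$ with exact fractions, on a polynomial chosen (as an assumption of the example) to have rational roots only, so that its factorization into factors of degree 1 is already visible over $\mathbf{Q}$. The coefficient-list representation and all function names are illustrative.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients (lists are lowest degree first)."""
    while p and p[-1] == 0:
        p = p[:-1]
    return p

def mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def derivative(p):
    return trim([i * c for i, c in enumerate(p)][1:])

def remainder(p, q):
    """Remainder of p on division by the non-zero polynomial q."""
    p = trim(list(p))
    while len(p) >= len(q):
        factor, shift = p[-1] / q[-1], len(p) - len(q)
        for i, c in enumerate(q):
            p[shift + i] -= factor * c
        p = trim(p)
    return p

def gcd(p, q):
    """Euclidean algorithm in Q[t]; the result is defined up to a constant."""
    p, q = trim(list(p)), trim(list(q))
    while q:
        p, q = q, remainder(p, q)
    return p

def from_roots(roots_with_mult):
    f = [Fraction(1)]
    for root, m in roots_with_mult:
        for _ in range(m):
            f = mul(f, [Fraction(-root), Fraction(1)])
    return f

# f(t) = (t - 1)^3 (t + 2)^2 t, so deg f = 6 and n_0(f) = 3.
f = from_roots([(1, 3), (-2, 2), (0, 1)])
d = gcd(f, derivative(f))
deg_f, deg_gcd = len(f) - 1, len(d) - 1
print(deg_f, deg_gcd, deg_f - deg_gcd)
```

The difference printed last is the number of distinct roots, and the g.c.d. found is a constant multiple of $(t - 1)^2(t + 2)$.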

The determination so far of the g.c.d. of a polynomial
and its derivative was carried out over an algebraically closed field, where
the polynomial factors into irreducible factors of degree 1. One can carry
out essentially the same arguments over any field $F$ of characteristic 0,
especially over the rational numbers themselves. In this more general case,
irreducible (prime) polynomials may have a degree $> 1$, and this degree must
be taken into account. Suppose $f$ has the factorization

$$f(t) = c\prod_{i=1}^{r} p_i(t)^{m_i}, \tag{2}$$

with a constant $c \neq 0$ and prime polynomials $p_1, \ldots, p_r$ in $F[t]$. We define

$$n_0^F(f) = \sum_{i=1}^{r} \deg p_i.$$

There is another notation which simplifies the use of indices, namely

$$n_0^F(f) = \sum_{p \mid f} \deg p.$$

The sum is taken over all the prime polynomials dividing $f$, and thus we
take the sum of the degrees of all such polynomials. The analogue of
Proposition 9.2 reads as follows.

Proposition 9.3. Let $f$ be a non-constant polynomial in $F[t]$. Suppose $f(t)$
has the factorization (2). Then the g.c.d.$(f, f')$ is

$$\text{g.c.d.}(f, f') = c_1\prod_{i=1}^{r} p_i(t)^{m_i - 1}$$

with some constant $c_1$. In particular,

$$\deg(f, f') = \deg f - n_0^F(f).$$

Proof Write it down yourself. It’ll keep you in shape. 

We now return to an algebraically closed field of characteristic 0. 

It is immediate that if $f$, $g$ are non-zero polynomials, then

$$n_0(fg) \leq n_0(f) + n_0(g).$$

If in addition $f$, $g$ are relatively prime, then we actually have an equality

$$n_0(fg) = n_0(f) + n_0(g).$$

It is obvious that $\deg f$ can be very large, but $n_0(f)$ may be small. For
instance,

$$f(t) = (t - \alpha)^{1000}$$

has degree 1000, but $n_0(f) = 1$. The Mason-Stothers theorem gives a remarkable
additive condition under which the degree cannot be large.

Theorem 9.4 (Mason-Stothers). Let $f$, $g$, $h$ be non-constant relatively prime
polynomials satisfying $f + g = h$. Then

$$\deg f, \ \deg g, \ \deg h \leq n_0(fgh) - 1.$$

The theorem shows in a very precise way how the additive relation
$f + g = h$ implies a bound for the degrees of $f$, $g$, $h$, namely the number of
distinct roots of the polynomial $fgh$, even with $-1$ tacked on.
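
As an illustration of the inequality, here is a Python sketch that tests it on the relation $(1 - t^2)^2 + (2t)^2 = (1 + t^2)^2$. The choice of this example, the coefficient-list representation, and the function names are assumptions made here. The number $n_0$ of distinct roots is computed as $\deg p - \deg(p, p')$, which is legitimate in characteristic 0 and does not require finding the roots.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zeros; coefficient lists are lowest degree first."""
    while p and p[-1] == 0:
        p = p[:-1]
    return p

def add(p, q):
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return trim([a + b for a, b in zip(p, q)])

def mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def derivative(p):
    return trim([i * c for i, c in enumerate(p)][1:])

def remainder(p, q):
    p = trim(list(p))
    while len(p) >= len(q):
        factor, shift = p[-1] / q[-1], len(p) - len(q)
        for i, c in enumerate(q):
            p[shift + i] -= factor * c
        p = trim(p)
    return p

def gcd(p, q):
    p, q = trim(list(p)), trim(list(q))
    while q:
        p, q = q, remainder(p, q)
    return p

def deg(p):
    return len(p) - 1

def n0(p):
    """Number of distinct roots, computed as deg p - deg gcd(p, p')."""
    return deg(p) - deg(gcd(p, derivative(p)))

def poly(*coeffs):
    return [Fraction(c) for c in coeffs]

u, v, w = poly(1, 0, -1), poly(0, 2), poly(1, 0, 1)   # 1 - t^2, 2t, 1 + t^2
f, g, h = mul(u, u), mul(v, v), mul(w, w)
relation_holds = add(f, g) == h
bound = n0(mul(mul(f, g), h)) - 1
print(relation_holds, deg(f), deg(g), deg(h), bound)
```

In this example the largest of the three degrees equals the bound, so the inequality of the theorem cannot be improved in general.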

Proof. (Noah Snyder [Sny 00]) We first note the identity

$$f'g - fg' = f'h - fh'.$$

To prove it, recall the property of derivatives, which gives $f' + g' = h'$.
Then

$$f'h - fh' = f'(f + g) - f(f' + g') = f'g - fg'$$

(because $f'f$ cancels $ff'$), which proves the desired identity.

We have $f'g - fg' \neq 0$, otherwise $f'g = fg' \neq 0$ because $f$, $g$ are
assumed non-constant. Since $f$, $g$ are relatively prime, this would imply $g \mid g'$,
which is impossible. Then we notice that the g.c.d. $(f, f')$ divides the left side
of the identity, $(g, g')$ divides the left side, and $(h, h')$ divides the right side,
which is equal to the left side. Therefore, since $f$, $g$, $h$ are relatively prime,
the product $(f, f')(g, g')(h, h')$ divides $f'g - fg'$.

This yields an inequality between the degrees, namely,

$$\deg(f, f') + \deg(g, g') + \deg(h, h') \leq \deg(f'g - fg') \leq \deg f + \deg g - 1. \tag{$*$}$$

We now use Proposition 9.2, namely the identity applied to $f$, $g$, $h$:

$$\deg(f, f') = \deg f - n_0(f)$$
$$\deg(g, g') = \deg g - n_0(g)$$
$$\deg(h, h') = \deg h - n_0(h).$$

We substitute these values in ($*$) above. We then cancel $\deg f$ and $\deg g$
from both sides, and move $n_0(f)$, $n_0(g)$, $n_0(h)$ to the right side, thus getting

$$\deg h \leq n_0(f) + n_0(g) + n_0(h) - 1 = n_0(fgh) - 1$$

since $f$, $g$, $h$ are relatively prime. Since $f$, $g$, $h$ enter essentially symmetrically
in the equation $f + g = h$, the same inequality is satisfied for $\deg f$ and
$\deg g$, thus concluding the proof.

In the XVIIth century, Fermat claimed to have proved that the equation

$$x^n + y^n = z^n$$

has no solution in positive integers if $n \geq 3$. He never commented again on
this claim, and nobody was able to prove it until 1995 when Wiles gave a
proof based on very deep and extensive mathematical theories developed
during the decades 1960-1990. Fermat’s claim was usually called Fermat’s
last theorem, but is now Wiles’ theorem.

The analogue for polynomials has been known for some time, at least
since the 19th century. The first proof was based on more advanced
techniques of algebraic geometry. However, it is an immediate consequence of
the Mason-Stothers theorem, as we now shall see.

Theorem 9.5. Let $n$ be an integer $\geq 3$. There is no solution of the equation

$$u^n + v^n = w^n$$

with non-constant relatively prime polynomials $u$, $v$, $w$.

Proof. Let $f = u^n$, $g = v^n$, and $h = w^n$. Then the Mason-Stothers
theorem yields

$$\deg u^n \leq n_0(u^n v^n w^n) - 1.$$

However, $\deg u^n = n \cdot \deg u$ and $n_0(u^n) = n_0(u) \leq \deg u$. Hence

$$n \cdot \deg u \leq \deg u + \deg v + \deg w - 1.$$

Similarly, we obtain the analogous inequality for $v$ and $w$, that is

$$n \cdot \deg v \leq \deg u + \deg v + \deg w - 1$$
$$n \cdot \deg w \leq \deg u + \deg v + \deg w - 1.$$

Adding the three inequalities yields

$$n(\deg uvw) \leq 3(\deg uvw) - 3 < 3(\deg uvw).$$

Cancelling $\deg uvw$ yields $n < 3$, so $n \leq 2$ since $n$ is an integer, thus proving
the theorem.


Of course, you should know that the equation

$$x^2 + y^2 = z^2$$

has infinitely many solutions in integers, and also has a solution in
polynomials. The solutions are the sides of what’s called a Pythagorean triangle.
For instance, $x = 3$, $y = 4$, $z = 5$ is a solution. With polynomials, we can
put

$$u = 1 - t^2, \quad v = 2t, \quad w = 1 + t^2.$$

Then you can verify at once that $u^2 + v^2 = w^2$. Substituting integers for $t$
then gives infinitely many Pythagorean triangles. Do the exercises. See
[La 85] for a general discussion of Pythagorean triples.
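
The substitution of integers for $t$ in the polynomial solution can be carried out mechanically. The short Python sketch below does this for a few values; the function name `triple` is chosen here for illustration.

```python
def triple(t):
    """Values of u = 1 - t^2, v = 2t, w = 1 + t^2 at the integer t."""
    return 1 - t * t, 2 * t, 1 + t * t

for t in range(2, 6):
    u, v, w = triple(t)
    # u is negative for t > 1; the triangle has sides |u|, v, w.
    print(t, (abs(u), v, w), u * u + v * v == w * w)
```

Not every triple obtained this way consists of relatively prime integers: odd values of $t$ give three even numbers.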


IV, §9. EXERCISES 

1. Let $p$ be a prime polynomial. Show that g.c.d.$(p, p') = 1$.

2. Prove Proposition 9.3. 

3. Take the rational numbers $\mathbf{Q}$ for a field $F$. With the $n_0^{\mathbf{Q}}$ defined at the end of
Chapter III, state and prove the version of the Mason-Stothers theorem for
polynomials in $\mathbf{Q}[t]$.

4. Pursuing the analogy between integers and polynomials, how would you define the
analogue of $n_0$ for a positive integer $a$? Suppose $a$ has the prime factorization

$$a = p_1^{m_1} \cdots p_r^{m_r}.$$

What’s a reasonable way to define $n_0(a)$? Be careful. The number of distinct roots
of a polynomial was a good definition when we could factor the polynomial in
terms of prime polynomials which have degree 1. However, one cannot always
do this for polynomials with rational coefficients because there are plenty of
irreducible polynomials of degree $> 1$.

5. Give at least three solutions in relatively prime integers for the equation
$x^2 + y^2 = z^2$.

6. Prove Davenport’s theorem [Dav 65]: Let $u$, $v$ be two non-constant relatively
prime polynomials such that $u^3 - v^2 \neq 0$. Show that

$$\tfrac{1}{2}\deg u \leq \deg(u^3 - v^2) - 1$$
$$\tfrac{1}{3}\deg v \leq \deg(u^3 - v^2) - 1.$$

Final Remark. The precise analogue of the Mason-Stothers theorem is not true
for integers. Masser-Oesterlé have made a conjecture which is a perturbation of
the Mason-Stothers inequality. For this conjecture and its history, see the next
section.

IV, §10. THE abc CONJECTURE

In this section we describe a great contemporary conjecture. Bibliographical 
references are listed at the end of the section. 

In the preceding section, we met the Mason-Stothers theorem for polynomials.
One of the most fruitful analogies in mathematics is that between the
integers $\mathbf{Z}$ and the ring of polynomials $F[t]$. Evolving from the insights of
Mason, Stothers, Frey [Fr], Szpiro, and others, Masser and Oesterlé formulated
the abc conjecture for integers as follows. Let $k$ be a non-zero integer.
Define the radical of $k$ to be

$$N_0(k) = \prod_{p \mid k} p,$$

i.e., the product of all the primes dividing $k$, taken with multiplicity 1.
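
For small integers the radical is easy to compute by trial division. The following Python sketch is one straightforward way to do it; the function name `radical` is our own, and trial division is chosen for simplicity, not efficiency.

```python
def radical(k):
    """Product of the distinct primes dividing the non-zero integer k."""
    k = abs(k)
    result, p = 1, 2
    while p * p <= k:
        if k % p == 0:
            result *= p
            while k % p == 0:   # remove every power of p
                k //= p
        p += 1
    if k > 1:                   # whatever remains is a single prime factor
        result *= k
    return result

print(radical(72), radical(-1000), radical(30), radical(1))
```

For instance $72 = 2^3 \cdot 3^2$ has radical 6, while a squarefree integer such as 30 is its own radical.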

The abc conjecture. Given $\epsilon > 0$ there exists a positive number $C(\epsilon)$
having the following property. For any non-zero relatively prime integers
$a$, $b$, $c$ such that $a + b = c$ we have

$$\max(|a|, |b|, |c|) \leq C(\epsilon)N_0(abc)^{1+\epsilon}.$$

Observe that the inequality says that many prime factors of $a$, $b$, $c$ occur
to the first power, and that if “small” primes occur to high powers, then
they have to be compensated by “large” primes occurring to the first
power. For instance, one might consider the equation

$$2^n \pm 1 = k.$$

For $n$ large, the abc conjecture would state that $k$ has to be divisible by
large primes to the first power. This phenomenon can be seen in the
tables of [BLSTW].

Stewart and Tijdeman [StT 86] have shown that it is necessary to have the
$\epsilon$ in the formulation of the conjecture and they gave a lower bound for
$\epsilon(N_0)$. Subsequent examples were communicated to me by Wojtek Jastrzebowski
and Dan Spielman as follows. We have to give examples such that
for all $C > 0$ there exist natural numbers $a$, $b$, $c$ relatively prime such that
$a + b = c$ and $a > CN_0(abc)$. But trivially,

$$2^n \mid (5^{2^n} - 1).$$

We consider the relations $a_n + b_n = c_n$ given by

$$(5^{2^n} - 1) + 1 = 5^{2^n}.$$

It is clear that these relations provide the desired examples. Alan Baker
has conjectured that $\max(|a|, |b|, |c|) \leq CN_0\Theta(N_0)$, where $\Theta(N_0)$ is the
number of integers $n$, $0 < n \leq N_0$, divisible only by primes dividing $N_0$, and $C$
is an absolute constant [Bak 98]. Valentin Blomer then showed that
$\log \Theta(N) \leq (\log 4 + o(1))(\log N)/\log\log N$.
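
The relations $(5^{2^n} - 1) + 1 = 5^{2^n}$ can be examined numerically for the first few $n$. The Python sketch below computes $N_0(abc)$ by trial division (the helper `radical` is our own name) and prints the ratio $a/N_0(abc)$. Since $2^n$ divides $a$, the radical of $a$ is at most $2a/2^n$, so $N_0(abc) \leq 10a/2^n$ and the ratio is at least $2^n/10$; this elementary bound is what the code checks.

```python
def radical(k):
    """Product of the distinct primes dividing the non-zero integer k."""
    k = abs(k)
    result, p = 1, 2
    while p * p <= k:
        if k % p == 0:
            result *= p
            while k % p == 0:
                k //= p
        p += 1
    if k > 1:
        result *= k
    return result

ratios = []
for n in range(1, 5):
    c = 5 ** (2 ** n)
    a, b = c - 1, 1
    N0 = radical(a * b * c)
    # 2^n divides a, hence N0 <= 10 a / 2^n, i.e. a / N0 >= 2^n / 10.
    ratios.append((n, a, N0, 10 * a >= 2 ** n * N0))
    print(n, a, N0, a / N0)
```

The printed ratios grow with $n$, which is the reason no inequality of the form $a \leq CN_0(abc)$ with a fixed constant $C$ can hold for all such triples.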

The abc conjecture implies what we shall call the asymptotic Fermat
theorem, that for all but a finite number of $n$, the equation

$$x^n + y^n = z^n$$

has no solution in relatively prime integers $x$, $y$, $z$. Indeed, we have by
the abc conjecture

$$|x^n| \ll |xyz|^{1+\epsilon}, \quad |y^n| \ll |xyz|^{1+\epsilon}, \quad |z^n| \ll |xyz|^{1+\epsilon},$$

where the sign $\ll$ means that the left-hand side is $\leq C(\epsilon)$ times the
right-hand side. Taking the product yields

$$|xyz|^n \ll |xyz|^{3+\epsilon},$$

whence for $|xyz| > 1$ we get $n$ bounded. The extent to which the abc
conjecture is proved with an explicit constant $C(\epsilon)$ (or say $C(1)$ to fix
ideas) yields the corresponding explicit determination of the bound for $n$
in the application.

We shall now see how the abc conjecture implies other conjectures by
Hall, Szpiro, and Lang-Waldschmidt.


Hall’s original conjecture is that if $u$, $v$ are relatively prime non-zero
integers such that $u^3 - v^2 \neq 0$ then

$$|u^3 - v^2| \gg |u|^{1/2 - \epsilon}.$$

Such an inequality determines a lower bound for the amount of
cancellation that can occur in a difference $u^3 - v^2$.

Note that if $|u^3 - v^2|$ is small, then $|u^3| \gg\ll |v^2|$ so $|v| \gg\ll |u|^{3/2}$. More
generally, following Lang-Waldschmidt, let us fix $A$, $B$ and let $u$, $v$, $k$, $m$,
$n$ be variable with $mn > m + n$. Put

$$Au^m + Bv^n = k.$$

By the abc conjecture, we get

$$|u|^m \ll |uvN_0(k)|^{1+\epsilon} \quad \text{and} \quad |v|^n \ll |uvN_0(k)|^{1+\epsilon}.$$

If, say, $|Au^m| \leq |Bv^n|$, then $|u| \ll |v|^{n/m}$, so

$$|v|^n \ll |v^{1+n/m}N_0(k)|^{1+\epsilon}$$

whence

$$|v| \ll N_0(k)^{\frac{m}{mn - (m+n)}(1+\epsilon)} \quad \text{and} \quad |u| \ll N_0(k)^{\frac{n}{mn - (m+n)}(1+\epsilon)}. \tag{1}$$

The situation is symmetric in $u$ and $v$. Again by the abc conjecture, we
have $|k| \ll |uvN_0(k)|^{1+\epsilon}$, so by (1) we find

$$|k| \ll N_0(k)^{\frac{mn}{mn - (m+n)}(1+\epsilon)}. \tag{2}$$

We give a significant example. 

Example. Take $m = 3$ and $n = 2$. From (1) we get the Hall conjecture,
by weakening the upper bound, replacing $N_0(k)$ with $k$. Observe also that
if we want a bound for integral relatively prime solutions of $y^2 = x^3 + b$
with integral $b$, then we find $|x| \ll |b|^{2+\epsilon}$. Thus the abc conjecture has
a direct bearing on the solutions of diophantine equations of classical
type.

Again take $m = 3$, $n = 2$ and take $A = 4$, $B = -27$. In this case, we
write $D$ instead of $k$, and find for

$$D = 4u^3 - 27v^2$$

that

$$|u| \ll N_0(D)^{2+\epsilon}, \quad |v| \ll N_0(D)^{3+\epsilon}. \tag{3}$$

These inequalities are supposed to hold at first for $u$, $v$ relatively prime.
If one allows an a priori bounded common factor, then (3) should also
hold in this case. We call (3) the generalized Szpiro conjecture.

The original Szpiro conjecture was

$$|D| \ll N_0(D)^{6+\epsilon},$$

but the generalized conjecture actually bounds $|u|$, $|v|$ in terms of the
“right” power of $N_0(D)$, not just $|D|$ itself.

The current trend of thoughts in this direction was started by Frey
[Fr], who associated with each solution of $a + b = c$ the polynomial

$$f(x) = x(x - 3a)(x + 3b).$$

The discriminant of the right-hand side is the product of the differences
of the roots squared, and so

$$D = 3^6(abc)^2.$$

We make a translation $\xi = x + b - a$ to get rid of the $x^2$ term, so that
our polynomial can be rewritten

$$f(x) = \xi^3 - \gamma_2\xi - \gamma_3,$$

where $\gamma_2$, $\gamma_3$ are homogeneous in $a$, $b$ of appropriate weight. Then

$$D = 4\gamma_2^3 - 27\gamma_3^2.$$

The use of 3 in the Frey polynomial was made so that $\gamma_2$, $\gamma_3$ come out to
be integers. You should verify that when $a$, $b$, $c$ are relatively prime, then
$\gamma_2$, $\gamma_3$ are relatively prime, or their greatest common divisor is 9. (Do
Exercise 1.)
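
The two expressions for the discriminant of the Frey polynomial can be compared numerically. In the Python sketch below, the explicit formulas $\gamma_2 = 3(a^2 + ab + b^2)$ and $\gamma_3 = (a - b)(2a + b)(a + 2b)$ are not taken from the text: they were obtained for this illustration by expanding $x(x - 3a)(x + 3b)$ after the substitution $x = \xi - (b - a)$, and the code re-checks that expansion at sample points before using it.

```python
from math import gcd

def frey_invariants(a, b):
    """gamma_2, gamma_3 with x(x - 3a)(x + 3b) = xi^3 - gamma_2 xi - gamma_3,
    where xi = x + b - a (formulas derived for this sketch)."""
    g2 = 3 * (a * a + a * b + b * b)
    g3 = (a - b) * (2 * a + b) * (a + 2 * b)
    return g2, g3

def expansion_ok(a, b):
    """Check the rewritten form of the Frey polynomial at a few integers x."""
    g2, g3 = frey_invariants(a, b)
    return all(
        x * (x - 3 * a) * (x + 3 * b) == (x + b - a) ** 3 - g2 * (x + b - a) - g3
        for x in range(-5, 6)
    )

results = []
for a, b in [(1, 1), (1, 2), (1, 4), (3, 5), (-2, 9)]:
    c = a + b
    g2, g3 = frey_invariants(a, b)
    D = 4 * g2**3 - 27 * g3**2
    results.append((expansion_ok(a, b), D == 3**6 * (a * b * c) ** 2, gcd(g2, g3)))
    print((a, b, c), (g2, g3), results[-1])
```

In each of these sample cases the greatest common divisor printed is 1 or 9, in line with the statement to be proved in Exercise 1.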


The Szpiro conjecture implies asymptotic Fermat. Indeed, suppose that

$$a = u^n, \quad b = v^n, \quad \text{and} \quad c = w^n.$$

Then

$$4\gamma_2^3 - 27\gamma_3^2 = 3^6(uvw)^{2n},$$

and we get a bound on $n$ from the Szpiro conjecture $|D| \ll N_0(D)^{6+\epsilon}$. Of
course any exponent would do, e.g. $|D| \ll N_0(D)^{100}$ for asymptotic
Fermat.

We have already seen that the abc conjecture implies generalized
Szpiro.

Conversely, generalized Szpiro implies abc. Indeed, the correspondence
between

$$(a, b) \leftrightarrow (\gamma_2, \gamma_3)$$

is “invertible,” and has the “right” weight. A simple algebraic manipulation
shows that the generalized Szpiro estimates on $\gamma_2$, $\gamma_3$ imply the
desired estimates on $|a|$, $|b|$. (Do Exercise 2.)

From this equivalence, one can use the examples given at the
beginning to show that the epsilon is needed in the Szpiro conjecture.

Hall made his conjecture in 1971, actually without the epsilon. The final
setting of the proofs in the simple abc context which we gave above had to
await Mason and the abc conjecture a decade later.

Let us return to the polynomial case and the Mason-Stothers theorem.
The proofs that the abc conjecture implies the other conjectures apply as well
in this case, so Hall, Szpiro, and Lang-Waldschmidt are also proved in
the polynomial case. Actually, it had already been conjectured in
[BCHS] that if $f$, $g$ are non-zero polynomials such that $f^3 - g^2 \neq 0$ then

$$\deg(f(t)^3 - g(t)^2) \geq \tfrac{1}{2}\deg f(t) + 1.$$

This (and its analogue for higher degrees) was proved by Davenport
[Dav] in 1965. As for ordinary integers, the point of the theorem is to
determine a lower bound for the cancellations which can occur in a
difference between a cube and a square, in the simplest case. The result
for polynomials is particularly clear since, unlike the case of integers,
there is no extraneous undetermined constant floating around, and there
is even $+1$ on the right-hand side.

The polynomial case as in Davenport and the Hall conjecture for
integers are of course not independent. Examples in the polynomial case
parametrize cases with integers when we substitute integers for the
variable. Examples are given in [BCHS], one of them due to Birch
being:

$$f(t) = t^6 + 4t^4 + 10t^2 + 6 \quad \text{and} \quad g(t) = t^9 + 6t^7 + 21t^5 + 35t^3 + \tfrac{63}{2}t,$$

whence

$$\deg(f(t)^3 - g(t)^2) = \tfrac{1}{2}\deg f + 1.$$

Substituting large integral values of $t \equiv 2 \bmod 4$ gives examples of large
values for $x^3 - y^2$. A fairly general construction is given by Danilov
[Dan].
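
The cancellation in Birch's example can be confirmed by exact arithmetic. The Python sketch below takes $f(t) = t^6 + 4t^4 + 10t^2 + 6$ and $g(t) = t^9 + 6t^7 + 21t^5 + 35t^3 + \tfrac{63}{2}t$, where the coefficient $\tfrac{63}{2}$ of $t$ is the value forced by requiring the $t^{10}$ term of $f^3 - g^2$ to vanish, and computes the degree of $f^3 - g^2$. The list representation of polynomials and the helper names are our own.

```python
from fractions import Fraction

def mul(p, q):
    """Product of two polynomials given as coefficient lists, lowest degree first."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def sub(p, q):
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return [a - b for a, b in zip(p, q)]

def degree(p):
    """Degree of a non-zero polynomial."""
    return max(i for i, c in enumerate(p) if c != 0)

f = [Fraction(c) for c in (6, 0, 10, 0, 4, 0, 1)]
g = [Fraction(c) for c in (0, Fraction(63, 2), 0, 35, 0, 21, 0, 6, 0, 1)]

diff = sub(mul(mul(f, f), f), mul(g, g))      # f^3 - g^2
print(degree(diff), Fraction(degree(f), 2) + 1)
print(diff[:5])
```

Both $f^3$ and $g^2$ have degree 18, so fourteen consecutive leading coefficients cancel, and Davenport's lower bound is attained.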


IV, §10. EXERCISES 

1. Prove the statement that if $a$, $b$, $c$ are relatively prime and $a + b = c$, then $\gamma_2$,
$\gamma_3$ are relatively prime or their g.c.d. is 9.

2. Prove that the generalized Szpiro conjecture implies the abc conjecture.

3. Conjecture. There are infinitely many primes $p$ such that $2^{p-1} \not\equiv 1 \bmod p^2$.
(You know of course that $2^{p-1} \equiv 1 \bmod p$ if $p$ is an odd prime.)

(a) Let $S$ be the set of primes such that $2^{p-1} \not\equiv 1 \bmod p^2$. If $n$ is a positive
integer, and $p$ is a prime such that $2^n - 1 = pk$ with some integer $k$ prime
to $p$, then prove that $p$ is in $S$.

(b) Prove that the abc conjecture implies the above conjecture. (Silverman, J.
of Number Theory, 1988.)

Remark. A conjecture of Lang-Trotter implies that the number of primes
$p \leq x$ such that $2^{p-1} \equiv 1 \bmod p^2$ is bounded by $C \log\log x$ for some constant
$C > 0$. So most primes would satisfy the condition that $2^{p-1} \not\equiv 1 \bmod p^2$.
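
To get a feeling for how rare the exceptional primes are, one can search for primes $p$ with $2^{p-1} \equiv 1 \bmod p^2$ in a small range. The Python sketch below does this below 4000; the bound and the helper name `is_prime` are arbitrary choices for this illustration.

```python
def is_prime(n):
    """Primality by trial division, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

# Primes p < 4000 that are NOT in the set S of Exercise 3(a).
exceptions = [p for p in range(2, 4000)
              if is_prime(p) and pow(2, p - 1, p * p) == 1]
print(exceptions)
```

Every other prime in the range belongs to the set $S$ of Exercise 3(a).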


[Bak 98] 
[BCHS] 
[BLSTW] 

[Dan] 

[Dav] 

[Fr] 

[Ha] 

[La 85] 

[La 90] 

[La 99] 
[Mas 83] 

[Sny 00] 
[StT 86] 
[Sto 81] 

[Ti 89] 

[Za 95] 
Readers acquainted with basic facts on vector spaces may jump immediately
to the field theory of Chapter VII.


CHAPTER V 


Vector Spaces and Modules 


V, §1. VECTOR SPACES AND BASES 

Let $K$ be a field. A vector space $V$ over the field $K$ is an additive
(abelian) group, together with a multiplication of elements of $V$ by
elements of $K$, i.e. an association

$$(x, v) \mapsto xv$$

of $K \times V$ into $V$, satisfying the following conditions:

VS 1. If 1 is the unit element of $K$, then $1v = v$ for all $v \in V$.

VS 2. If $c \in K$ and $v, w \in V$, then $c(v + w) = cv + cw$.

VS 3. If $x, y \in K$ and $v \in V$, then $(x + y)v = xv + yv$.

VS 4. If $x, y \in K$ and $v \in V$, then $(xy)v = x(yv)$.

Example 1. Let $V$ be the set of continuous real-valued functions on
the interval $[0, 1]$. Then $V$ is a vector space over $\mathbf{R}$. The addition of
functions is defined as usual: If $f$, $g$ are functions, we define

$$(f + g)(t) = f(t) + g(t).$$

If $c \in \mathbf{R}$, we define $(cf)(t) = cf(t)$. It is then a simple routine matter to
verify that all four conditions are satisfied.

Example 2. Let $S$ be a non-empty set, and $V$ the set of all maps of $S$
into $K$. Then $V$ is a vector space over $K$, the addition of maps and the
multiplication of maps by elements of $K$ being defined as for functions in
the preceding example.

Example 3. Let $K^n$ denote the product $K \times \cdots \times K$, i.e. the set of
$n$-tuples of elements of $K$. (If $K = \mathbf{R}$, this is the usual Euclidean space.)
We define addition of $n$-tuples componentwise, that is if

$$X = (x_1, \ldots, x_n) \quad \text{and} \quad Y = (y_1, \ldots, y_n)$$

are elements of $K^n$ with $x_i, y_i \in K$, then we define

$$X + Y = (x_1 + y_1, \ldots, x_n + y_n).$$

If $c \in K$, we define

$$cX = (cx_1, \ldots, cx_n).$$

It is routine to verify that all four conditions of a vector space are
satisfied by these operations.
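
The componentwise operations on $K^n$ translate directly into code. The following Python sketch assumes $K = \mathbf{Q}$, represented by exact fractions, and checks the four conditions VS 1 through VS 4 on sample vectors; the names `add` and `scale` and the sample values are illustrative, and a check on samples is not a substitute for the routine verification.

```python
from fractions import Fraction

def add(X, Y):
    """Componentwise sum of two n-tuples."""
    return tuple(x + y for x, y in zip(X, Y))

def scale(c, X):
    """Product of the scalar c with the n-tuple X."""
    return tuple(c * x for x in X)

X = (Fraction(1, 2), Fraction(-3), Fraction(0))
Y = (Fraction(4), Fraction(2, 7), Fraction(-1, 3))
c, d = Fraction(2, 3), Fraction(-5)

checks = [
    scale(1, X) == X,                                        # VS 1
    scale(c, add(X, Y)) == add(scale(c, X), scale(c, Y)),    # VS 2
    scale(c + d, X) == add(scale(c, X), scale(d, X)),        # VS 3
    scale(c * d, X) == scale(c, scale(d, X)),                # VS 4
]
print(checks)
```

Each condition reduces, component by component, to the corresponding property of addition and multiplication in $K$.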

Example 4. Taking n = 1 in Example 3, we see that K is a vector 
space over itself. 

Let $V$ be a vector space over the field $K$. Let $v \in V$. Then $0v = 0$.

Proof. $0v + v = 0v + 1v = (0 + 1)v = 1v = v$. Hence adding $-v$ to
both sides shows that $0v = 0$.

If $c \in K$ and $cv = 0$, but $c \neq 0$, then $v = 0$.

To see this, multiply by $c^{-1}$ to find $c^{-1}cv = 0$ whence $v = 0$.

We have $(-1)v = -v$.

Proof.

$$(-1)v + v = (-1)v + 1v = (-1 + 1)v = 0v = 0.$$

Hence $(-1)v = -v$.

Let $V$ be a vector space, and $W$ a subset of $V$. We shall say that $W$
is a subspace if $W$ is a subgroup (of the additive group of $V$), and if
given $c \in K$ and $v \in W$ then $cv$ is also an element of $W$. In other words, a
subspace $W$ of $V$ is a subset satisfying the following conditions:

(i) If $v$, $w$ are elements of $W$, their sum $v + w$ is also an element of
$W$.

(ii) The element 0 of $V$ is also an element of $W$.

(iii) If $v \in W$ and $c \in K$ then $cv \in W$.

Then $W$ itself is a vector space. Indeed, properties VS 1 through VS 4,
being satisfied for all elements of $V$, are satisfied a fortiori for the
elements of $W$.

Let $V$ be a vector space, and $w_1, \ldots, w_n$ elements of $V$. Let $W$ be the
set of all elements

$$x_1w_1 + \cdots + x_nw_n$$

with $x_i \in K$. Then $W$ is a subspace of $V$, as one verifies without difficulty.
It is called the subspace generated by $w_1, \ldots, w_n$, and we call $w_1, \ldots, w_n$
generators for this subspace.

Let $V$ be a vector space over the field $K$, and let $v_1, \ldots, v_n$ be elements
of $V$. We shall say that $v_1, \ldots, v_n$ are linearly dependent over $K$ if there
exist elements $a_1, \ldots, a_n$ in $K$ not all equal to 0 such that

$$a_1v_1 + \cdots + a_nv_n = 0.$$

If there do not exist such elements, then we say that $v_1, \ldots, v_n$ are linearly
independent over $K$. We often omit the words “over $K$”.

Example 5. Let $V = K^n$ and consider the vectors

$$v_1 = (1, 0, \ldots, 0)$$
$$\vdots$$
$$v_n = (0, 0, \ldots, 1).$$

Then $v_1, \ldots, v_n$ are linearly independent. Indeed, let $a_1, \ldots, a_n$ be elements
of $K$ such that $a_1v_1 + \cdots + a_nv_n = 0$. Since

$$a_1v_1 + \cdots + a_nv_n = (a_1, \ldots, a_n),$$

it follows that all $a_i = 0$.

Example 6. Let V be the vector space of all functions of a real vari-
able t. Let $f_1, \ldots, f_n$ be n functions. To say that they are linearly
dependent is to say that there exist n real numbers $a_1, \ldots, a_n$ not all
equal to 0 such that

$$a_1 f_1(t) + \cdots + a_n f_n(t) = 0$$

for all values of t.

The two functions $e^t$, $e^{2t}$ are linearly independent. To prove this,
suppose that there are numbers a, b such that

$$a e^t + b e^{2t} = 0$$

(for all values of t). Differentiate this relation. We obtain

$$a e^t + 2b e^{2t} = 0.$$

Subtract the first from the second relation. We obtain $b e^{2t} = 0$, and
hence b = 0. From the first relation, it follows that $a e^t = 0$, and hence
a = 0. Hence $e^t$, $e^{2t}$ are linearly independent.

Consider again an arbitrary vector space V over a field K. Let
$v_1, \ldots, v_n$ be linearly independent elements of V. Let $x_1, \ldots, x_n$ and
$y_1, \ldots, y_n$ be numbers. Suppose that we have

$$x_1 v_1 + \cdots + x_n v_n = y_1 v_1 + \cdots + y_n v_n.$$

In other words, two linear combinations of $v_1, \ldots, v_n$ are equal. Then we
must have $x_i = y_i$ for each $i = 1, \ldots, n$. Indeed, subtracting the
right-hand side from the left-hand side, we get

$$x_1 v_1 - y_1 v_1 + \cdots + x_n v_n - y_n v_n = 0.$$

We can write this relation also in the form

$$(x_1 - y_1)v_1 + \cdots + (x_n - y_n)v_n = 0.$$

By definition, we must have $x_i - y_i = 0$ for all $i = 1, \ldots, n$, thereby
proving our assertion.

We define a basis of V over K to be a sequence of elements
$\{v_1, \ldots, v_n\}$ of V which generate V and are linearly independent.

The vectors $v_1, \ldots, v_n$ of Example 5 form a basis of $K^n$ over K.

Let W be the vector space of functions generated over R by the two
functions $e^t$, $e^{2t}$. Then $\{e^t, e^{2t}\}$ is a basis of W over R.

Let V be a vector space, and let $\{v_1, \ldots, v_n\}$ be a basis of V. The
elements of V can be represented by n-tuples relative to this basis, as
follows. If an element v of V is written as a linear combination

$$v = x_1 v_1 + \cdots + x_n v_n$$

of the basis elements, then we call $(x_1, \ldots, x_n)$ the coordinates of v
with respect to our basis, and we call $x_i$ the i-th coordinate. We say that
the n-tuple $X = (x_1, \ldots, x_n)$ is the coordinate vector of v with respect
to the basis $\{v_1, \ldots, v_n\}$.

For example, let V be the vector space of functions generated by the
two functions $e^t$, $e^{2t}$. Then the coordinates of the function

$$3e^t + 5e^{2t}$$

with respect to the basis $\{e^t, e^{2t}\}$ are (3, 5).



Example 7. Show that the vectors (1, 1) and (-3, 2) are linearly inde-
pendent over R.

Let a, b be two real numbers such that

$$a(1, 1) + b(-3, 2) = 0.$$

Writing this equation in terms of components, we find

$$a - 3b = 0,$$
$$a + 2b = 0.$$

This is a system of two equations which we solve for a and b. Subtract-
ing the second from the first, we get $-5b = 0$, whence b = 0. Substitut-
ing in either equation, we find a = 0. Hence a, b are both 0, and our
vectors are linearly independent.

Example 8. Find the coordinates of (1, 0) with respect to the two
vectors (1, 1) and (-1, 2).

We must find numbers a, b such that

$$a(1, 1) + b(-1, 2) = (1, 0).$$

Writing this equation in terms of coordinates, we find

$$a - b = 1,$$
$$a + 2b = 0.$$

Solving for a and b in the usual manner yields $b = -\tfrac{1}{3}$ and
$a = \tfrac{2}{3}$. Hence the coordinates of (1, 0) with respect to (1, 1)
and (-1, 2) are $(\tfrac{2}{3}, -\tfrac{1}{3})$.
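
The computation of Example 8 can be checked mechanically. The following is a minimal sketch, assuming vectors in Q^2 given as pairs of integers; the function name `coordinates_2d` is ours, not the text's, and it solves the two equations by Cramer's rule with exact rational arithmetic.

```python
from fractions import Fraction

def coordinates_2d(x, a, b):
    """Return (s, t) with s*a + t*b == x, for a basis a, b of Q^2."""
    # The determinant is non-zero exactly when a, b are linearly independent.
    det = a[0] * b[1] - a[1] * b[0]
    if det == 0:
        raise ValueError("a and b are linearly dependent")
    s = Fraction(x[0] * b[1] - x[1] * b[0], det)
    t = Fraction(a[0] * x[1] - a[1] * x[0], det)
    return s, t

# Example 8: coordinates of (1, 0) with respect to (1, 1) and (-1, 2).
print(coordinates_2d((1, 0), (1, 1), (-1, 2)))
```

For the vectors of Example 7 the determinant is 5, which is another way of seeing that (1, 1) and (-3, 2) are linearly independent.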

Let $\{v_1, \ldots, v_n\}$ be a set of elements of a vector space V over a
field K. Let r be a positive integer $\leq n$. We shall say that
$\{v_1, \ldots, v_r\}$ is a maximal subset of linearly independent elements if
$v_1, \ldots, v_r$ are linearly independent, and if in addition, given any $v_i$
with $i > r$, the elements $v_1, \ldots, v_r, v_i$ are linearly dependent.

The next theorem gives us a useful criterion to determine when a set 
of elements of a vector space is a basis. 

Theorem 1.1. Let $\{v_1, \ldots, v_n\}$ be a set of generators of a vector
space V. Let $\{v_1, \ldots, v_r\}$ be a maximal subset of linearly independent
elements. Then $\{v_1, \ldots, v_r\}$ is a basis of V.

Proof. We must prove that $v_1, \ldots, v_r$ generate V. We shall first prove
that each $v_i$ (for $i > r$) is a linear combination of $v_1, \ldots, v_r$. By
hypothesis, given $v_i$, there exist $x_1, \ldots, x_r, y \in K$, not all 0,
such that

$$x_1 v_1 + \cdots + x_r v_r + y v_i = 0.$$

Furthermore, $y \neq 0$, because otherwise, we would have a relation of
linear dependence for $v_1, \ldots, v_r$. Hence we can solve for $v_i$, namely

$$v_i = \frac{x_1}{-y} v_1 + \cdots + \frac{x_r}{-y} v_r,$$

thereby showing that $v_i$ is a linear combination of $v_1, \ldots, v_r$.

Next, let v be any element of V. There exist $c_1, \ldots, c_n \in K$ such that

$$v = c_1 v_1 + \cdots + c_n v_n.$$

In this relation, we can replace each $v_i$ ($i > r$) by a linear combination
of $v_1, \ldots, v_r$. If we do this, and then collect terms, we find that we
have expressed v as a linear combination of $v_1, \ldots, v_r$. This proves
that $v_1, \ldots, v_r$ generate V, and hence form a basis of V.
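
Theorem 1.1 suggests a procedure for extracting a basis from a list of generators. The sketch below is an illustration under our own assumptions, not part of the text: vectors are tuples of rationals, linear independence is tested by a small Gaussian elimination, and the helper names `rank` and `maximal_independent_subset` are hypothetical. Each generator is kept only if it is independent of those already kept, so every discarded generator is a linear combination of the kept ones.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors over Q, by Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def maximal_independent_subset(generators):
    """Keep each generator that is independent of those already kept."""
    kept = []
    for v in generators:
        if rank(kept + [v]) == len(kept) + 1:
            kept.append(v)
    return kept

gens = [(1, 1, 0), (2, 2, 0), (0, 1, 1), (1, 2, 1), (0, 0, 1)]
print(maximal_independent_subset(gens))
```

In this sample, (2, 2, 0) and (1, 2, 1) are discarded because they depend on the vectors kept before them, and the three remaining vectors form a basis of the space generated by the list.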

Let V, W be vector spaces over K. A map

$$f: V \to W$$

is called a K-linear map, or a homomorphism of vector spaces, if f
satisfies the following condition: For all $x \in K$ and $v, v' \in V$ we have

$$f(v + v') = f(v) + f(v'), \qquad f(xv) = x f(v).$$

Thus f is a homomorphism of V into W viewed as additive groups,
satisfying the additional condition $f(xv) = x f(v)$. We usually say "linear
map" instead of "K-linear map".

Let $f: V \to W$ and $g: W \to U$ be linear maps. Then the composite
$g \circ f$ is a linear map.

The verification is immediate, and will be left to the reader. 

Theorem 1.2. Let V, W be vector spaces, and $\{v_1, \ldots, v_n\}$ a basis of V.
Let $w_1, \ldots, w_n$ be elements of W. Then there exists a unique linear map
$f: V \to W$ such that $f(v_i) = w_i$ for all i.

Proof. Such a K-linear map f is uniquely determined, because if

$$v = x_1 v_1 + \cdots + x_n v_n$$

is an element of V, with $x_i \in K$, then we must necessarily have

$$f(v) = x_1 f(v_1) + \cdots + x_n f(v_n) = x_1 w_1 + \cdots + x_n w_n.$$

The map f exists, for given an element v as above, we define f(v) to be
$x_1 w_1 + \cdots + x_n w_n$. We must then see that f is a linear map. Let

$$v' = y_1 v_1 + \cdots + y_n v_n$$

be an element of V with $y_i \in K$. Then

$$v + v' = (x_1 + y_1)v_1 + \cdots + (x_n + y_n)v_n.$$

Hence

$$\begin{aligned}
f(v + v') &= (x_1 + y_1)w_1 + \cdots + (x_n + y_n)w_n \\
&= x_1 w_1 + y_1 w_1 + \cdots + x_n w_n + y_n w_n \\
&= f(v) + f(v').
\end{aligned}$$

If $c \in K$, then $cv = cx_1 v_1 + \cdots + cx_n v_n$, and hence

$$f(cv) = cx_1 w_1 + \cdots + cx_n w_n = c f(v).$$

This proves that f is linear, and concludes the proof of the theorem.
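
The construction in the proof of Theorem 1.2 can be traced on a small case. The sketch below makes several assumptions of its own: V is Q^2 with the basis (1, 1), (-1, 2), the target vectors `W1`, `W2` in Q^3 are chosen arbitrarily for illustration, and all function names are hypothetical. The map f is defined exactly as in the proof, by taking coordinates with respect to the basis and forming the same combination of the target vectors.

```python
from fractions import Fraction

V1, V2 = (1, 1), (-1, 2)          # a basis of Q^2
W1, W2 = (1, 0, 2), (0, 3, 1)     # hypothetical target vectors in Q^3

def coords(v):
    """Coordinates (x1, x2) of v with respect to the basis V1, V2."""
    det = V1[0] * V2[1] - V1[1] * V2[0]
    x1 = Fraction(v[0] * V2[1] - v[1] * V2[0], det)
    x2 = Fraction(V1[0] * v[1] - V1[1] * v[0], det)
    return x1, x2

def f(v):
    """The linear map with f(V1) = W1 and f(V2) = W2: f(v) = x1*W1 + x2*W2."""
    x1, x2 = coords(v)
    return tuple(x1 * a + x2 * b for a, b in zip(W1, W2))

def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def scale(c, v):
    return tuple(c * a for a in v)

u, v = (3, 5), (-2, 7)
print(f(V1) == W1, f(V2) == W2)
print(f(add(u, v)) == add(f(u), f(v)), f(scale(4, u)) == scale(4, f(u)))
```

The checks at the end only test the two linearity conditions on sample vectors; the proof above is what establishes them in general.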

The kernel of a linear map is defined to be the kernel of the map
viewed as an additive group-homomorphism. Thus Ker f is the set of
$v \in V$ such that $f(v) = 0$. We leave to the reader to prove:

The kernel and image of a linear map are subspaces.

Let $f: V \to W$ be a linear map. Then f is injective if and only if
Ker f = 0.

Proof. Suppose f is injective. If $f(v) = 0$, then by definition and
the fact that $f(0) = 0$ we must have v = 0. Hence Ker f = 0. Con-
versely, suppose Ker f = 0. Let $f(v_1) = f(v_2)$. Then $f(v_1 - v_2) = 0$ so
$v_1 - v_2 = 0$ and $v_1 = v_2$. Hence f is injective. This proves our assertion.

Let $f: V \to W$ be a linear map. If f is bijective, that is injective and
surjective, then f has an inverse mapping

$$g: W \to V.$$

If f is linear and bijective, then the inverse mapping $g: W \to V$ is also a
linear map.

Proof. Let $w_1, w_2 \in W$. Since f is surjective, there exist $v_1, v_2 \in V$
such that $f(v_1) = w_1$ and $f(v_2) = w_2$. Then $f(v_1 + v_2) = w_1 + w_2$. By
definition of the inverse map,

$$g(w_1 + w_2) = v_1 + v_2 = g(w_1) + g(w_2).$$

We leave to the reader the proof that $g(cw) = c g(w)$ for $c \in K$ and
$w \in W$. This concludes the proof that g is linear.

As with groups, we say that a linear map $f: V \to W$ is an isomorphism
(i.e. a vector space isomorphism) if it has a linear inverse, i.e. there exists a
linear map $g: W \to V$ such that $g \circ f$ is the identity of V, and
$f \circ g$ is the identity of W. The preceding remark shows that a linear
map is an isomorphism if and only if it is bijective.

Let V, W be vector spaces over the field K. We let

$\mathrm{Hom}_K(V, W)$ = set of all linear maps of V into W.

Let $f, g: V \to W$ be linear maps. Then we can define the sum f + g, just
as we define the sum of any mappings from a set into W. Thus by
definition

$$(f + g)(v) = f(v) + g(v).$$

If $c \in K$ we define cf to be the map such that

$$(cf)(v) = c f(v).$$

With these definitions, it is then easily verified that

$\mathrm{Hom}_K(V, W)$ is a vector space over K.

We leave the steps in the verification to the reader. In case V = W, we
call the homomorphisms (or K-linear maps) of V into itself the endo-
morphisms of V, and we let

$$\mathrm{End}_K(V) = \mathrm{Hom}_K(V, V).$$


V, §1. EXERCISES 

1. Show that the following vectors are linearly independent, over R and over C. 

(a) (1, 1, 1) and (0, 1, -1)               (b) (1, 0) and (1, 1)

(c) (-1, 1, 0) and (0, 1, 2)               (d) (2, -1) and (1, 0)

(e) ($\pi$, 0) and (0, 1)                  (f) (1, 2) and (1, 3)

(g) (1, 1, 0), (1, 1, 1) and (0, 1, -1)    (h) (0, 1, 1), (0, 2, 1) and (1, 5, 3)

2. Express the given vector X as a linear combination of the given vectors A, B 
and find the coordinates of X with respect to A, B. 

(a) X = (1, 0), A = (1, 1), B = (0, 1)

(b) X = (2, 1), A = (1, -1), B = (1, 1)

(c) X = (1, 1), A = (2, 1), B = (-1, 0)

(d) X = (4, 3), A = (2, 1), B = (-1, 0)

(You may view the above vectors as elements of $\mathbf{R}^2$ or $\mathbf{C}^2$.
The coordinates will be the same.)



3. Find the coordinates of the vector X with respect to the vectors A, B , C. 

(a) X = (1, 0, 0), A = (1, 1, 1), B = (-1, 1, 0), C = (1, 0, -1)

(b) X = (1, 1, 1), A = (0, 1, -1), B = (1, 1, 0), C = (1, 0, 2)

(c) X = (0, 0, 1), A = (1, 1, 1), B = (-1, 1, 0), C = (1, 0, -1)

4. Let (a, b) and (c, d) be two vectors in $K^2$. If $ad - bc = 0$, show that
they are linearly dependent. If $ad - bc \neq 0$, show that they are linearly
independent.

5. Prove that 1, $\sqrt{2}$ are linearly independent over the rational numbers.

6. Prove that 1, $\sqrt{3}$ are linearly independent over the rational numbers.

7. Let a be a complex number. Show that a is rational if and only if 1, a are 
linearly dependent over the rational numbers. 


V, §2. DIMENSION OF A VECTOR SPACE 

The main result of this section is that any two bases of a vector space 
have the same number of elements. To prove this, we first have an inter- 
mediate result. 

Theorem 2.1. Let V be a vector space over the field K. Let
$\{v_1, \ldots, v_m\}$ be a basis of V over K. Let $w_1, \ldots, w_n$ be elements
of V, and assume that $n > m$. Then $w_1, \ldots, w_n$ are linearly dependent.

Proof. Assume that $w_1, \ldots, w_n$ are linearly independent. Since
$\{v_1, \ldots, v_m\}$ is a basis, there exist elements $a_1, \ldots, a_m \in K$
such that

$$w_1 = a_1 v_1 + \cdots + a_m v_m.$$

By assumption, we know that $w_1 \neq 0$, and hence some $a_i \neq 0$. After
renumbering $v_1, \ldots, v_m$ if necessary, we may assume without loss of
generality that (say) $a_1 \neq 0$. We can then solve for $v_1$, and get

$$a_1 v_1 = w_1 - a_2 v_2 - \cdots - a_m v_m,$$

$$v_1 = a_1^{-1} w_1 - a_1^{-1} a_2 v_2 - \cdots - a_1^{-1} a_m v_m.$$

The subspace of V generated by $w_1, v_2, \ldots, v_m$ contains $v_1$, and hence
must be all of V since $v_1, v_2, \ldots, v_m$ generate V. The idea is now to
continue our procedure stepwise, and to replace successively $v_2, v_3, \ldots$
by $w_2, w_3, \ldots$ until all the elements $v_1, \ldots, v_m$ are exhausted, and
$w_1, \ldots, w_m$ generate V. Let us now assume by induction that there is an
integer r with $1 \leq r < m$ such that, after a suitable renumbering of
$v_1, \ldots, v_m$, the elements $w_1, \ldots, w_r, v_{r+1}, \ldots, v_m$ generate V.
There exist elements $b_1, \ldots, b_r, c_{r+1}, \ldots, c_m$ in K such that

$$w_{r+1} = b_1 w_1 + \cdots + b_r w_r + c_{r+1} v_{r+1} + \cdots + c_m v_m.$$

We cannot have $c_j = 0$ for $j = r+1, \ldots, m$, for otherwise, we get a
relation of linear dependence between $w_1, \ldots, w_{r+1}$, contradicting our
assumption. After renumbering $v_{r+1}, \ldots, v_m$ if necessary, we may assume
without loss of generality that (say) $c_{r+1} \neq 0$. We then obtain

$$c_{r+1} v_{r+1} = w_{r+1} - b_1 w_1 - \cdots - b_r w_r - c_{r+2} v_{r+2} - \cdots - c_m v_m.$$

Dividing by $c_{r+1}$, we conclude that $v_{r+1}$ is in the subspace generated by
$w_1, \ldots, w_{r+1}, v_{r+2}, \ldots, v_m$. By our induction assumption, it follows
that $w_1, \ldots, w_{r+1}, v_{r+2}, \ldots, v_m$ generate V. Thus by induction, we
have proved that $w_1, \ldots, w_m$ generate V. If we write

$$w_{m+1} = x_1 w_1 + \cdots + x_m w_m$$

with $x_i \in K$, we obtain a relation of linear dependence

$$w_{m+1} - x_1 w_1 - \cdots - x_m w_m = 0,$$

as was to be shown.

Theorem 2.2. Let V be a vector space over K, and let $\{v_1, \ldots, v_n\}$ and
$\{w_1, \ldots, w_m\}$ be two bases of V. Then m = n.

Proof. By Theorem 2.1, we must have $n \leq m$ and $m \leq n$, so m = n.

If a vector space has one basis, then every other basis has the same 
number of elements. This number is called the dimension of V (over K). 
If V is the zero vector space, we define V to have dimension 0. 

Corollary 2.3. Let V be a vector space of dimension n , and let W be a 
subspace containing n linearly independent elements. Then W = V. 

Proof. Let $v \in V$ and let $w_1, \ldots, w_n$ be linearly independent elements
of W. Then $w_1, \ldots, w_n, v$ are linearly dependent, so there exist
$a, b_1, \ldots, b_n \in K$ not all zero such that

$$av + b_1 w_1 + \cdots + b_n w_n = 0.$$

We cannot have a = 0, otherwise $w_1, \ldots, w_n$ are linearly dependent. Then

$$v = -a^{-1} b_1 w_1 - \cdots - a^{-1} b_n w_n$$

is an element of W. This proves $V \subset W$, so V = W.

Theorem 2.4. Let $f: V \to W$ be a homomorphism of vector spaces over
K. Assume that V, W have finite dimension and that dim V = dim W.
If Ker f = 0 or if Im f = W then f is an isomorphism.

Proof. Suppose Ker f = 0. Let $\{v_1, \ldots, v_n\}$ be a basis of V. Then
$f(v_1), \ldots, f(v_n)$ are linearly independent, for suppose
$c_1, \ldots, c_n \in K$ are such that

$$c_1 f(v_1) + \cdots + c_n f(v_n) = 0.$$

Then

$$f(c_1 v_1 + \cdots + c_n v_n) = 0,$$

and since f is injective, we have $c_1 v_1 + \cdots + c_n v_n = 0$. Hence
$c_i = 0$ for $i = 1, \ldots, n$ since $\{v_1, \ldots, v_n\}$ is a basis of V. Hence
Im f is a subspace of W of dimension n, whence Im f = W by Corollary 2.3.
Hence f is also surjective, so f is an isomorphism.

We leave the other case to the reader, namely the proof that if f is
surjective, then f is an isomorphism.


Let V be a vector space. We define an automorphism of V to be an
invertible linear map

$$f: V \to V$$

of V with itself. We denote the set of automorphisms of V by

Aut(V) or GL(V).

The letters GL stand for "General Linear".

Theorem 2.5. The set Aut(V) is a group.

Proof. The multiplication is composition of mappings. Such composi-
tion is associative, we have seen that the inverse of a linear map is
linear, and the identity is linear. Thus all the group axioms are satisfied.

The group Aut(V) is one of the most important groups in mathe-
matics. In the next chapter we shall study it for finite dimensional vector
spaces in terms of matrices.


V, §2. EXERCISES 

1. Let V be a finite dimensional vector space over K. Let W be a subspace.
Let $\{w_1, \ldots, w_m\}$ be a basis of W. Show that there exist elements
$w_{m+1}, \ldots, w_n$ in V such that $\{w_1, \ldots, w_n\}$ is a basis of V.

2. If f is a linear map, $f: V \to V'$, prove that

$$\dim V = \dim \operatorname{Im} f + \dim \operatorname{Ker} f.$$

3. Let U, W be subspaces of a vector space V.

(a) Show that U + W is a subspace.

(b) Define $U \times W$ to be the set of all pairs (u, w) with $u \in U$ and
$w \in W$. Show how $U \times W$ is a vector space. If U, W are finite
dimensional, show that

$$\dim(U \times W) = \dim U + \dim W.$$

(c) Prove that $\dim U + \dim W = \dim(U + W) + \dim(U \cap W)$. [Hint: Con-
sider the linear map $f: U \times W \to U + W$ given by $f(u, w) = u - w$.]

V, §3. MATRICES AND LINEAR MAPS 

Although we expect that the reader will have had some elementary linear 
algebra previously, we recall here some of these basic results to make 
the present book self contained. We go rapidly. 

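An $m \times n$ matrix $A = (a_{ij})$ is a doubly indexed family of elements
$a_{ij}$ in a field K, with $i = 1, \ldots, m$ and $j = 1, \ldots, n$. A matrix is
usually written as a rectangular array

$$A = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{pmatrix}.$$

The elements $a_{ij}$ are called the components of A. If m = n then A is
called a square matrix.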

As a matter of notation, we let: 

$\mathrm{Mat}_{m \times n}(K)$ = set of $m \times n$ matrices in K,

$\mathrm{Mat}_n(K)$ or $M_n(K)$ = set of $n \times n$ matrices in K.

Let $A = (a_{ij})$ and $B = (b_{ij})$ be $m \times n$ matrices in K. We define
their sum to be the matrix whose ij-component is

$$a_{ij} + b_{ij}.$$

Thus we take the sum of matrices componentwise. Then $\mathrm{Mat}_{m \times n}(K)$
is an additive group under this addition. The verification is immediate.

Let $c \in K$. We define $cA = (c a_{ij})$, so we multiply each component of A
by c. Then it is also immediately verified that $\mathrm{Mat}_{m \times n}(K)$ is a
vector space over K. The zero element is the matrix

$$\begin{pmatrix}
0 & 0 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 0
\end{pmatrix}$$

having components equal to 0.

Let $A = (a_{ij})$ be an $m \times n$ matrix and $B = (b_{jk})$ be an $n \times r$
matrix. Then we define the product AB to be the $m \times r$ matrix whose
ik-component is

$$\sum_{j=1}^{n} a_{ij} b_{jk}.$$

It is easily verified that this multiplication satisfies the distributive law:

$$A(B + C) = AB + AC \quad \text{and} \quad (A + B)C = AC + BC$$

provided that the formulas make sense. For the first one, B and C must
have the same size, and the products AB, AC must be defined. For the
second one, A and B must have the same size, and AC, BC must be
defined. Furthermore, if (AB)C is defined, then (AB)C = A(BC).
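
The definition of the product and the laws just stated can be tried out on small integer matrices. The following is a minimal sketch, assuming matrices are stored as lists of rows; the helper names `mat_mul` and `mat_add` and the sample matrices are ours.

```python
def mat_mul(A, B):
    """Product of an m x n matrix A and an n x r matrix B (lists of rows)."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must match rows of B")
    r = len(B[0])
    # The ik-component is the sum over j of a_ij * b_jk.
    return [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(r)]
            for i in range(len(A))]

def mat_add(A, B):
    """Componentwise sum of two matrices of the same size."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

A = [[1, 2, 0], [0, 1, -1]]        # 2 x 3
B = [[1, 0], [2, 1], [0, 3]]       # 3 x 2
C = [[0, 1], [1, 1], [4, 0]]       # 3 x 2
D = [[2, 1], [1, 0]]               # 2 x 2

print(mat_mul(A, B))
print(mat_mul(A, mat_add(B, C)) == mat_add(mat_mul(A, B), mat_mul(A, C)))
print(mat_mul(mat_mul(A, B), D) == mat_mul(A, mat_mul(B, D)))
```

Note that BA is also defined here but is a 3 x 3 matrix, so AB and BA need not even have the same size.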

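Given n we define the unit or identity $n \times n$ matrix to be

$$I_n = \begin{pmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{pmatrix},$$

in other words, $I_n$ is a square $n \times n$ matrix, having components 1 on the
diagonal, and components 0 otherwise. From the definition of multipli-
cation, if A is an $m \times n$ matrix, we get:

$$I_m A = A \quad \text{and} \quad A I_n = A.$$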

Let M n (K ) denote the set of all n x n matrices with components in K. 

Then M n (K ) is a ring under the above addition and multiplication of 

matrices. 

This statement is merely a summary of properties which we have 
already listed. 

There is a natural map of K in $M_n(K)$, namely

$$c \mapsto cI_n = \begin{pmatrix}
c & 0 & \cdots & 0 \\
0 & c & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & c
\end{pmatrix},$$

which sends an element $c \in K$ on the diagonal matrix having diagonal
components equal to c and otherwise 0 components. We call $cI_n$ a scalar
matrix. The map

$$c \mapsto cI_n$$

is an isomorphism of K onto the K-vector space of scalar matrices.

We now describe the correspondence between matrices and linear
maps.

Let $A = (a_{ij})$ be an $m \times n$ matrix in a field K. Then A gives rise to a
linear map

$$L_A: K^n \to K^m \quad \text{by} \quad X \mapsto AX = L_A(X).$$

Theorem 3.1. The association $A \mapsto L_A$ is an isomorphism between the
vector space of $m \times n$ matrices and the space of linear maps $K^n \to K^m$.

Proof. If A, B are $m \times n$ matrices and $L_A = L_B$ then A = B because if
$E^j$ is the j-th unit vector

$$E^j = \begin{pmatrix} 0 \\ \vdots \\ 1 \\ \vdots \\ 0 \end{pmatrix}$$

with 1 in the j-th component and 0 otherwise, then $AE^j = A^j$ is the j-th
column of A, so if $AE^j = BE^j$ for all $j = 1, \ldots, n$ we conclude that A = B.

Next we have to prove that $A \mapsto L_A$ is surjective. Let $L: K^n \to K^m$ be
an arbitrary linear map. Let $\{U^1, \ldots, U^m\}$ be the unit vectors in $K^m$.
Then there exist elements $a_{ij} \in K$ such that

$$L(E^j) = \sum_{i=1}^{m} a_{ij} U^i.$$

Let $A = (a_{ij})$. If

$$X = \sum_{j=1}^{n} x_j E^j,$$

then

$$L(X) = \sum_{j=1}^{n} x_j L(E^j) = \sum_{j=1}^{n} \sum_{i=1}^{m} x_j a_{ij} U^i
= \sum_{i=1}^{m} \Bigl( \sum_{j=1}^{n} a_{ij} x_j \Bigr) U^i.$$



This means that $L = L_A$ and concludes the proof of the theorem.

Next we give a slightly different formulation. Let V be a vector space
of dimension n over K. Let $\{v_1, \ldots, v_n\}$ be a basis of V. Recall that we
have an isomorphism

$$K^n \to V \quad \text{given by} \quad (x_1, \ldots, x_n) \mapsto x_1 v_1 + \cdots + x_n v_n.$$


Now let V, W be vector spaces over K of dimensions n, m respective-
ly. Let

$$L: V \to W$$

be a linear map. Let $\{v_1, \ldots, v_n\}$ and $\{w_1, \ldots, w_m\}$ be bases of V
and W respectively. Let $a_{ij} \in K$ be such that

$$L(v_j) = \sum_{i=1}^{m} a_{ij} w_i.$$

Then the matrix $A = (a_{ij})$ is said to be associated to L with respect to the
given bases.

Theorem 3.2. The association of the above matrix to L gives an iso-
morphism between the space of $m \times n$ matrices and the space of linear
maps $\mathrm{Hom}_K(V, W)$.

Proof. The proof is similar to the proof of Theorem 3.1 and is left to
the reader.

Basically what is happening is that when we represent an element of
V as a linear combination

$$v = x_1 v_1 + \cdots + x_n v_n,$$

and view

$$X = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}$$

as its coordinate vector, then L(v) is represented by AX in terms of the
coordinates.
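
In the special case $V = K^n$, $W = K^m$ with the unit vectors as bases, the associated matrix has $L(E^j)$ as its j-th column, and the statement that L(v) is represented by AX can be checked directly. The sketch below assumes a sample linear map `L` from Q^3 to Q^2 chosen only for illustration; the names `matrix_of` and `apply` are ours.

```python
def matrix_of(L, n):
    """Matrix A whose j-th column is L(E^j), for a linear map L: K^n -> K^m."""
    columns = [L(tuple(1 if i == j else 0 for i in range(n))) for j in range(n)]
    m = len(columns[0])
    return [[columns[j][i] for j in range(n)] for i in range(m)]

def apply(A, X):
    """The map L_A: X |-> AX, with X given as a tuple of components."""
    return tuple(sum(a * x for a, x in zip(row, X)) for row in A)

def L(X):
    # Hypothetical linear map Q^3 -> Q^2, used only as a sample.
    x, y, z = X
    return (x + 2 * y, y - z)

A = matrix_of(L, 3)
print(A)
print(apply(A, (3, -1, 4)) == L((3, -1, 4)))
```

The function `matrix_of` trusts that its argument is linear; for a non-linear function it would still return a matrix, but `apply(A, X)` would then disagree with the function away from the unit vectors.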


V, §3. EXERCISES 

1. Exhibit a basis for the following vector spaces: 

(a) The space of all m x n matrices. 

(b) The space of symmetric $n \times n$ matrices. A matrix $A = (a_{ij})$ is said
to be symmetric if $a_{ij} = a_{ji}$ for all i, j.

(c) The space of $n \times n$ triangular matrices. A matrix $A = (a_{ij})$ is said
to be upper triangular if $a_{ij} = 0$ whenever $j < i$.

2. If dim V = n and dim W = m, what is $\dim \mathrm{Hom}_K(V, W)$? Proof?

3. Let R be a ring. We define the center Z of R to be the subset of all elements
$z \in R$ such that $zx = xz$ for all $x \in R$.

(a) Show that the center of a ring R is a subring.

(b) Let $R = \mathrm{Mat}_{n \times n}(K)$ be the ring of $n \times n$ matrices over the
field K. Show that the center is the set of scalar matrices cI, with $c \in K$.

V, §4. MODULES 

We may consider a generalization of the notion of vector space over a
field, namely module over a ring. Let R be a ring. By a (left) module
over R, or an R-module, one means an additive group M, together with
a map $R \times M \to M$, which to each pair (x, v) with $x \in R$ and $v \in M$
associates an element xv of M, satisfying the four conditions:

MOD 1. If e is the unit element of R, then $ev = v$ for all $v \in M$.

MOD 2. If $x \in R$ and $v, w \in M$, then $x(v + w) = xv + xw$.

MOD 3. If $x, y \in R$ and $v \in M$, then $(x + y)v = xv + yv$.

MOD 4. If $x, y \in R$ and $v \in M$, then $(xy)v = x(yv)$.

Example 1. Every left ideal of R is a module. The additive group
consisting of 0 alone is an R-module for every ring R.

As with vector spaces, we have $0v = 0$ for every $v \in M$. (Note that the
0 in $0v$ is the zero element of R, while the 0 on the other side of the
equation is the zero element of the additive group M. However, there
will be no confusion in using the same symbol 0 for all zero elements
everywhere.) Also, we have $(-e)v = -v$, with the same proof as for
vector spaces.

Let M be a module over R and let N be a subgroup of M. We say
that N is a submodule of M if whenever $v \in N$ and $x \in R$ then $xv \in N$.
It follows that N is then itself a module.

Example 2. Let M be a module and $v_1, \ldots, v_n$ elements of M. Let N
be the subset of M consisting of all elements

$$x_1 v_1 + \cdots + x_n v_n$$

with $x_i \in R$. Then N is a submodule of M. Indeed,

$$0 = 0v_1 + \cdots + 0v_n,$$

so $0 \in N$. If $y_1, \ldots, y_n \in R$, then

$$x_1 v_1 + \cdots + x_n v_n + y_1 v_1 + \cdots + y_n v_n = (x_1 + y_1)v_1 + \cdots + (x_n + y_n)v_n$$

is in N. Finally, if $c \in R$, then

$$c(x_1 v_1 + \cdots + x_n v_n) = cx_1 v_1 + \cdots + cx_n v_n$$

is in N, so we have proved that N is a submodule. It is called the sub-
module generated by $v_1, \ldots, v_n$, and we call $v_1, \ldots, v_n$ generators for N.

Example 3. Let M be an (abelian) additive group, and let R be a
subring of End(M). (We defined End(M) in Chapter III, §1 as the ring
of homomorphisms of M into itself.) Then M is an R-module, if to each
$f \in R$ and $v \in M$ we associate the element $fv = f(v) \in M$. The
verification of the four conditions for a module is trivially carried out.

Conversely, given a ring R and an R-module M, to each $x \in R$ we as-
sociate the mapping $\lambda_x: M \to M$ such that $\lambda_x(v) = xv$ for $v \in M$.
Then the association

$$x \mapsto \lambda_x$$

is a ring-homomorphism of R into End(M), where End(M) is the ring of
endomorphisms of M viewed as additive group. This is but another
way of formulating the four conditions MOD 1 through MOD 4. For
instance, MOD 4 in the present notation can be written

$$\lambda_{xy} = \lambda_x \lambda_y \quad \text{or} \quad \lambda_{xy} = \lambda_x \circ \lambda_y$$

since the multiplication in End(M) is composition of mappings.

Warning. It may be that the ring-homomorphism $x \mapsto \lambda_x$ is not injec-
tive, so that in general, when dealing with a module, we cannot view R
as a subring of End(M).

Example 4. Let us denote by $K^n$ the set of column vectors, that is
column n-tuples

$$X = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} \quad \text{with components } x_i \in K.$$

Then $K^n$ is a module over the ring $M_n(K)$. Indeed, matrix multiplication
defines a mapping

$$M_n(K) \times K^n \to K^n$$

by

$$(A, X) \mapsto AX.$$

This multiplication satisfies the four axioms MOD 1 through MOD 4 for
a module. This is one of the most important examples of modules in
mathematics.

Let R be a ring, and let M, M' be R-modules. By an R-linear map
(or R-homomorphism) $f: M \to M'$ one means a map such that for all
$x \in R$ and $v, w \in M$ we have

$$f(xv) = x f(v), \qquad f(v + w) = f(v) + f(w).$$

Thus an R-linear map is the generalization of a K-linear map when the
module is a vector space over a field.

The set of all R-linear maps of M into M' will be denoted by
$\mathrm{Hom}_R(M, M')$.

Example 5. Let M, M', M'' be R-modules. If

$$f: M \to M' \quad \text{and} \quad g: M' \to M''$$

are R-linear maps, then the composite map $g \circ f$ is R-linear.

In analogy with previous definitions, we say that an R-homomorphism
$f: M \to M'$ is an isomorphism if there exists an R-homomorphism
$g: M' \to M$ such that $g \circ f$ and $f \circ g$ are the identity mappings of M
and M', respectively. We leave it to the reader to verify that:

An R-homomorphism is an isomorphism if and only if it is bijective.

As with vector spaces and additive groups, we have to consider very
frequently the set of R-linear maps of a module M into itself, and it is
convenient to have a name for these maps. They are called R-endo-
morphisms of M. The set of R-endomorphisms of M is denoted by

$$\mathrm{End}_R(M).$$

We often suppress the prefix R- when the reference to the ring R is
clear.

Let $f: M \to M'$ be a homomorphism of modules over R. We define
the kernel of f to be its kernel viewed as a homomorphism of additive
groups.

In analogy with previous results, we have:

Let $f: M \to M'$ be a homomorphism of R-modules. Then the kernel of f
and the image of f are submodules of M and M' respectively.



Proof. Let E be the kernel of f. Then we already know that E is an
additive subgroup of M. Let $v \in E$ and $x \in R$. Then

$$f(xv) = x f(v) = x0 = 0,$$

so $xv \in E$, and this proves that the kernel of f is a submodule of M. We
already know that the image of f is a subgroup of M'. Let v' be in the
image of f, and $x \in R$. Let v be an element of M such that

$$f(v) = v'.$$

Then $f(xv) = x f(v) = xv'$ also lies in the image of f, which is therefore a
submodule of M', thereby proving our assertion.


Example 6. Let R be a ring, and M a left ideal. Let $y \in M$. The map

$$r_y: M \to M$$

such that

$$r_y(x) = xy$$

is an R-linear map of M into itself. Indeed, if $x \in M$ then $xy \in M$ since
$y \in M$ and M is a left ideal, and the conditions for R-linearity are refor-
mulations of definitions. For instance,

$$r_y(x_1 + x_2) = (x_1 + x_2)y = x_1 y + x_2 y = r_y(x_1) + r_y(x_2).$$

Furthermore, for $z \in R$, $x \in M$,

$$r_y(zx) = zxy = z r_y(x).$$

We call $r_y$ right multiplication by y. Thus $r_y$ is an R-endomorphism of
M.

Observe that any abelian group can be viewed as a module over the
integers. Thus an R-module M is also a Z-module, and any R-endo-
morphism of M is also an endomorphism of M viewed as abelian group.
Thus $\mathrm{End}_R(M)$ is a subset of $\mathrm{End}(M) = \mathrm{End}_{\mathbf{Z}}(M)$.

In fact, $\mathrm{End}_R(M)$ is a subring of End(M), so that $\mathrm{End}_R(M)$ is
itself a ring.

The proof is routine. For instance, if $f, g \in \mathrm{End}_R(M)$, and $x \in R$,
$v \in M$, then

$$\begin{aligned}
(f + g)(xv) &= f(xv) + g(xv) \\
&= x f(v) + x g(v) \\
&= x(f(v) + g(v)) \\
&= x(f + g)(v).
\end{aligned}$$

So $f + g \in \mathrm{End}_R(M)$. Equally easily,

$$(f \circ g)(xv) = f(g(xv)) = f(x g(v)) = x f(g(v)).$$

The identity is in $\mathrm{End}_R(M)$. This proves that $\mathrm{End}_R(M)$ is a
subring of $\mathrm{End}_{\mathbf{Z}}(M)$.

We now also see that M can be viewed as a module over $\mathrm{End}_R(M)$
since M is a module over $\mathrm{End}_{\mathbf{Z}}(M) = \mathrm{End}(M)$.

Let us denote $\mathrm{End}_R(M)$ by R'(M) or simply R' for clarity of notation.
Let $f \in R'$ and $x \in R$. Then by definition,

$$f(xv) = x f(v),$$

and consequently

$$f \circ \lambda_x(v) = \lambda_x \circ f(v).$$

Hence $\lambda_x$ is an R'-linear map of M into itself, i.e. an element of
$\mathrm{End}_{R'}(M)$. The association

$$\lambda: x \mapsto \lambda_x$$

is therefore a ring-homomorphism of R into $\mathrm{End}_{R'}(M)$, not only into
End(M).

Theorem 4.1. Let R be a ring, and M an R-module. Let J be the set of
elements $x \in R$ such that $xv = 0$ for all $v \in M$. Then J is a two-sided
ideal of R.

Proof. If $x, y \in J$, then $(x + y)v = xv + yv = 0$ for all $v \in M$. If
$a \in R$, then

$$(ax)v = a(xv) = 0 \quad \text{and} \quad (xa)v = x(av) = 0$$

for all $v \in M$. This proves the theorem.

We observe that the two-sided ideal of Theorem 4.1 is none other
than the kernel of the ring-homomorphism

$$x \mapsto \lambda_x$$

described in the preceding discussion.
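
A small commutative case makes the ideal J concrete. The sketch below assumes the module $M = \mathbf{Z}/4 \times \mathbf{Z}/6$ over the ring $R = \mathbf{Z}$, an example of our choosing rather than one from the text, and searches a finite range of integers for those x with xv = 0 for all v in M. Since Z is commutative, the two-sidedness in Theorem 4.1 is automatic here; the point is only to see J as the kernel of $x \mapsto \lambda_x$.

```python
from itertools import product

MODULI = (4, 6)   # M = Z/4 x Z/6, viewed as a module over R = Z (assumed example)

def act(x, v):
    """Scalar multiplication of an integer x on an element v of M."""
    return tuple((x * a) % n for a, n in zip(v, MODULI))

M = list(product(*(range(n) for n in MODULI)))
ZERO = (0, 0)

def annihilates(x):
    """True when x*v = 0 for every v in M, i.e. when lambda_x is the zero map."""
    return all(act(x, v) == ZERO for v in M)

# Elements of J inside a sampled range of integers.
J = [x for x in range(-24, 25) if annihilates(x)]
print(J)
```

The integers found are exactly the multiples of 12 in the sampled range, in agreement with the fact that x kills both factors precisely when 4 and 6 both divide x.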

Theorem 4.2 (Wedderburn-Rieffel). Let R be a ring, and L a non-zero
left ideal, viewed as R-module. Let $R' = \mathrm{End}_R(L)$, and
$R'' = \mathrm{End}_{R'}(L)$. Let

$$\lambda: R \to R''$$

be the ring-homomorphism such that $\lambda_x(y) = xy$ for $x \in R$ and $y \in L$.
Assume that R has no two-sided ideals other than 0 and R itself. Then
$\lambda$ is a ring-isomorphism.

Proof (Rieffel). The fact that $\lambda$ is injective follows from Theorem 4.1,
and the hypothesis that L is non-zero. Therefore, the only thing to
prove is that $\lambda$ is surjective. By Example 6 of Chapter III, §2, we know
that LR is a two-sided ideal, non-zero since R has a unit, and hence
equal to R by hypothesis. Then

$$\lambda(R) = \lambda(LR) = \lambda(L)\lambda(R).$$

We now contend that $\lambda(L)$ is a left ideal of R''. To prove this, let
$f \in R''$, and let $x \in L$. For all $y \in L$, we know from Example 6 that
$r_y$ is in R', and hence that

$$f \circ r_y = r_y \circ f.$$

This means that $f(xy) = f(x)y$. We may rewrite this relation in the form

$$f \circ \lambda_x(y) = \lambda_{f(x)}(y).$$

Hence $f \circ \lambda_x$ is an element of $\lambda(L)$, namely $\lambda_{f(x)}$. This proves
that $\lambda(L)$ is a left ideal of R''. But then

$$R''\lambda(R) = R''\lambda(L)\lambda(R) = \lambda(L)\lambda(R) = \lambda(R).$$

Since $\lambda(R)$ contains the identity map, say e, it follows that for every
$f \in R''$, the map $f \circ e = f$ is contained in $\lambda(R)$, i.e. R'' is contained
in $\lambda(R)$, and therefore $R'' = \lambda(R)$, as was to be proved.

The whole point of Theorem 4.2 is that it represents R as a ring of
endomorphisms of some module, namely the left ideal L. This is impor-
tant in the following case.

Let D be a ring. We shall say that D is a division ring if the set of
non-zero elements of D is a multiplicative group (and so in particular,
$1 \neq 0$ in the ring). Note that a commutative division ring is what we
called a field.

Let R be a ring, and M a module over R. We shall say that M is a
simple module if $M \neq \{0\}$, and if M has no submodules other than {0}
and M itself.


Theorem 4.3 (Schur’s Lemma). Let M be a simple module over the ring 
R. Then End R (M) is a division ring. 


Proof. We know it is a ring, and we must prove that every non-zero
element f has an inverse. Since $f \neq 0$, the image of f is a submodule of
M which is $\neq 0$ and hence is equal to all of M, so that f is surjective.
The kernel of f is a submodule of M and is not equal to M, so that the
kernel of f is 0, and f is therefore injective. Hence f has an inverse as a
group-homomorphism, and it is verified at once that this inverse is an
R-homomorphism, thereby proving our theorem.


Example 7. Let R be a ring, and L a left ideal which is simple as an
R-module (we say then that L is a simple left ideal). Then
$\mathrm{End}_R(L) = D$ is a division ring. If it happens that D is commutative,
then under the hypothesis of Theorem 4.2, we conclude that
$R \approx \mathrm{End}_D(L)$ is the ring of all D-linear maps of L into itself, and L
is a vector space over the field D. Thus we have a concrete picture
concerning the ring R. See Exercises 23 and 24.


Example 8. In Exercise 21 you will show that the ring of endomor-
phisms of a finite dimensional vector space satisfies the hypothesis of
Theorem 4.2. In other words, $\mathrm{End}_K(V)$ has no two-sided ideal other than
{0} and itself. Furthermore, V is simple as an $\mathrm{End}_K(V)$-module
(Exercise 18). Theorem 4.2 gives some sort of converse to this, and shows
that it is a typical example.


Just as with groups, one tries to decompose a module over a ring into 
simple parts. In Chapter II, §3, Exercises 15 and 16, you defined the 
direct sum of abelian groups. We have the same notion for modules as 
follows. 

Let $M_1, \ldots, M_q$ be modules over R. We can form their direct product

$$\prod_{i=1}^{q} M_i = M_1 \times \cdots \times M_q$$

consisting of all $q$-tuples $(v_1, \ldots, v_q)$ with $v_i \in M_i$. This direct product is the
direct product of $M_1, \ldots, M_q$ viewed as abelian groups, and we can define
the multiplication by an element $c \in R$ componentwise, that is

$$c(v_1, \ldots, v_q) = (cv_1, \ldots, cv_q).$$

It is then immediately verified that the direct product is an R-module.

On the other hand, let M be a module, and let $M_1, \ldots, M_q$ be
submodules. We say that M is the direct sum of $M_1, \ldots, M_q$ if every
element $v \in M$ has a unique expression as a sum

$$v = v_1 + \cdots + v_q \quad \text{with } v_i \in M_i.$$

If M is such a direct sum, then we denote this sum by

$$M = M_1 \oplus \cdots \oplus M_q \quad \text{or also} \quad \bigoplus_{i=1}^{q} M_i.$$

For any module M with submodules $M_1, \ldots, M_q$ there is a natural
homomorphism from the direct product into M, namely

$$\prod_{i=1}^{q} M_i \to M \quad \text{given by} \quad (v_1, \ldots, v_q) \mapsto v_1 + \cdots + v_q.$$

Proposition 4.4. Let M be a module.

(a) Let $M_1$, $M_2$ be submodules. We have $M = M_1 \oplus M_2$ if and only if
$M = M_1 + M_2$ and $M_1 \cap M_2 = \{0\}$.

(b) The module M is a direct sum of submodules $M_1, \ldots, M_q$ if and
only if the natural homomorphism from the product $\prod M_i$ into M is
an isomorphism.

(c) The sum $\sum M_i$ is a direct sum of $M_1, \ldots, M_q$ if and only if given a
relation

$$v_1 + \cdots + v_q = 0 \quad \text{with } v_i \in M_i,$$

we have $v_i = 0$ for $i = 1, \ldots, q$.

Proof. The proof will be left as a routine exercise to the reader. Note
that condition (c) is similar to a condition of linear independence.

In §7 you will see an example of a direct sum decomposition for 
modules over principal rings, similar to the decomposition of an abelian 
group. 
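
As an illustration of the uniqueness condition in the definition of a direct sum, here is a minimal Python sketch. It takes the module to be $\mathbf{Z}/n$ viewed as a module over $\mathbf{Z}$, which is an assumption made only for the example; the function name `is_internal_direct_sum` is likewise hypothetical. The check simply counts in how many ways each element can be written as a sum with one term from each submodule.

```python
from itertools import product

def is_internal_direct_sum(n, submodules):
    """Check whether Z/n is the direct sum of the given subgroups.

    Each subgroup is given as a list of residues mod n. The sum is direct
    exactly when every element of Z/n can be written as v_1 + ... + v_q
    (with v_i in the i-th subgroup) in exactly one way.
    """
    counts = [0] * n
    for combo in product(*submodules):
        counts[sum(combo) % n] += 1
    return all(c == 1 for c in counts)

M1 = [0, 3]       # subgroup of Z/6 generated by 3
M2 = [0, 2, 4]    # subgroup of Z/6 generated by 2
print(is_internal_direct_sum(6, [M1, M2]))           # True
print(is_internal_direct_sum(4, [[0, 2], [0, 2]]))   # False
```

In the second call the two subgroups coincide, so their intersection is not $\{0\}$ and the expression of an element as a sum is not unique, in agreement with part (a) of Proposition 4.4.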

V, §4. EXERCISES

1. Let R be a ring. Show that R can be viewed as a module over itself, and has 
one generator. 

2. Let R be a ring and M an R-module. Show that $\mathrm{Hom}_R(R, M)$ and M are
isomorphic as additive groups, under the mapping $f \mapsto f(1)$.

3. Let E, F be R-modules. Show that $\mathrm{Hom}_R(E, F)$ is a module over $\mathrm{End}_R(F)$,
the operation of the ring $\mathrm{End}_R(F)$ on the additive group $\mathrm{Hom}_R(E, F)$ being
composition of mappings.

4. Let E be a module over the ring R, and let L be a left ideal of R. Let LE be
the set of all elements $x_1 v_1 + \cdots + x_n v_n$ with $x_i \in L$ and $v_i \in E$. Show that LE
is a submodule of E.

5. Let R be a ring, E a module, and L a left ideal. Assume that L and E are
simple.

(a) Show that $LE = E$ or $LE = \{0\}$.

(b) Assume that $LE = E$. Define the notion of isomorphism of modules.
Prove that L is isomorphic to E as R-module. [Hint: Let $v_0 \in E$ be an
element such that $Lv_0 \neq \{0\}$. Show that the map $x \mapsto xv_0$ establishes an
R-isomorphism between L and E.]

6. Let R be a ring and let E, F be R-modules. Let $\sigma\colon E \to F$ be an
isomorphism. Show that $\mathrm{End}_R(E)$ and $\mathrm{End}_R(F)$ are ring-isomorphic, under the map

$$f \mapsto \sigma \circ f \circ \sigma^{-1}$$

for $f \in \mathrm{End}_R(E)$.

7. Let E, F be simple modules over the ring R. Let $f\colon E \to F$ be a
homomorphism. Show that $f$ is 0 or $f$ is an isomorphism.

8. Verify in detail the last assertion made in the proof of Theorem 4.3. 

Let R be a ring, and E a module. We say that E is a free module if there
exist elements $v_1, \ldots, v_n$ in E such that every element $v \in E$ has a unique
expression of the form

$$v = x_1 v_1 + \cdots + x_n v_n$$

with $x_i \in R$. If this is the case, then $\{v_1, \ldots, v_n\}$ is called a basis of E (over R).

9. Let E be a free module over the ring R, with basis $\{v_1, \ldots, v_n\}$. Let F be a
module, and $w_1, \ldots, w_n$ elements of F. Show that there exists a unique
homomorphism $f\colon E \to F$ such that $f(v_i) = w_i$ for $i = 1, \ldots, n$.

10. Let R be a ring, and S a set consisting of n elements, say $s_1, \ldots, s_n$. Let F be
the set of mappings from S into R.

(a) Show that F is a module.

(b) If $x \in R$, denote by $x s_i$ the function of S into R which associates $x$ to $s_i$
and 0 to $s_j$ for $j \neq i$. Show that F is a free module, that $\{1s_1, \ldots, 1s_n\}$ is a
basis for F over R, and that every element $v \in F$ has a unique expression
of the form $x_1 s_1 + \cdots + x_n s_n$ with $x_i \in R$.

11. Let K be a field, and $R = K[X]$ the polynomial ring over K. Let J be the
ideal generated by $X^2$. Show that R/J is a K-space. What is its dimension?

12. Let K be a field and $R = K[X]$ the polynomial ring over K. Let $f(X)$ be a
polynomial of degree $d > 0$ in $K[X]$. Let J be the ideal generated by $f(X)$.
What is the dimension of R/J over K? Exhibit a basis of R/J over K. Show
that R/J is an integral ring if and only if $f$ is irreducible.

13. If R is a commutative ring, and E, F are modules, show that $\mathrm{Hom}_R(E, F)$ is
an R-module in a natural way. Is this still true if R is not commutative?

14. Let K be a field, and R a vector space over K of dimension 2. Let $\{e, u\}$ be
a basis of R over K. If $a, b, c, d$ are elements of K, define the product

$$(ae + bu)(ce + du) = ace + (bc + ad)u.$$

Show that this product makes R into a ring. What is the unit element?
Show that this ring is isomorphic to the ring $K[X]/(X^2)$ of Exercise 11.

15. Let the notation be as in the preceding exercise. Let $f(X)$ be a polynomial
in $K[X]$. Show that

$$f(ae + u) = f(a)e + f'(a)u,$$

where $f'$ is the formal derivative of $f$.

16. Let R be a ring, and let E', E, F be R-modules. If $f\colon E' \to E$ is an
R-homomorphism, show that the map $\varphi \mapsto f \circ \varphi$ is a Z-homomorphism

$$\mathrm{Hom}_R(F, E') \to \mathrm{Hom}_R(F, E),$$

and is an R-homomorphism if R is commutative.

17. A sequence of homomorphisms of abelian groups

$$A \xrightarrow{f} B \xrightarrow{g} C$$

is said to be exact if $\mathrm{Im}\, f = \mathrm{Ker}\, g$. Thus to say that $0 \to A \xrightarrow{f} B$ is exact
means that $f$ is injective. Let R be a ring. If

$$0 \to E' \to E \to E''$$

is an exact sequence of R-modules, show that for every R-module F

$$0 \to \mathrm{Hom}_R(F, E') \to \mathrm{Hom}_R(F, E) \to \mathrm{Hom}_R(F, E'')$$

is an exact sequence.

18. Let V be a finite dimensional vector space over a field K. Let $R = \mathrm{End}_K(V)$.
Prove that V is a simple R-module. [Hint: Given $v, w \in V$ and $v \neq 0$, $w \neq 0$
use Theorem 1.2 to show that there exists $f \in R$ such that $f(v) = w$, so
$Rv = V$.]

19. Let R be the ring of $n \times n$ matrices over a field K. Show that the set of
matrices of type

$$\begin{pmatrix} a_1 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ a_n & 0 & \cdots & 0 \end{pmatrix}$$

having components equal to 0 except possibly on the first column, is a left
ideal of R. Prove a similar statement for the set of matrices having
components 0 except possibly on the $j$-th column.

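20. Let A, B be $n \times n$ matrices over a field K, all of whose components are equal
to 0 except possibly those of the first column. Assume $A \neq 0$. Show that
there exists an $n \times n$ matrix C over K such that $CA = B$. Hint: Consider
first a special case where

$$A = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{pmatrix}.$$
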
21. Let V be a finite dimensional vector space over the field K. Let R be the
ring of K-linear maps of V into itself. Show that R has no two-sided ideals
except $\{0\}$ and R itself. [Hint: Let $A \in R$, $A \neq 0$. Let $v_1 \in V$, $v_1 \neq 0$, and
$Av_1 \neq 0$. Complete $v_1$ to a basis $\{v_1, \ldots, v_n\}$ of V. Let $\{w_1, \ldots, w_n\}$ be arbitrary
elements of V. For each $i = 1, \ldots, n$ there exists $B_i \in R$ such that

$$B_i v_i = v_1 \quad \text{and} \quad B_i v_j = 0 \ \text{ if } j \neq i,$$

and there exists $C_i \in R$ such that $C_i A v_1 = w_i$ (justify these two existence
statements in detail). Let $F = C_1 A B_1 + \cdots + C_n A B_n$. Show that $F(v_i) = w_i$ for all
$i = 1, \ldots, n$. Conclude that the two-sided ideal generated by A is the whole
ring R.]

22. Let V be a vector space over a field K and let R be a subring of $\mathrm{End}_K(V)$
containing all the scalar maps, i.e. all maps $cI$ with $c \in K$. Let L be a left
ideal of R. Let LV be the set of all elements $A_1 v_1 + \cdots + A_n v_n$ with $A_i \in L$
and $v_i \in V$, and all positive integers n. Show that LV is a subspace W of V
such that $RW \subset W$. A subspace having this property is called R-invariant.

23. Let D be a division ring containing a field K as a subfield. We assume that
K is contained in the center of D.

(a) Verify that the addition and the multiplication in D allow us to view D
as a vector space over K.

(b) Assume that D is finite dimensional over K. Let $\alpha \in D$. Show that there
exists a polynomial $f(t) \in K[t]$ of degree $\geq 1$ such that $f(\alpha) = 0$. [Hint:
For some n, the powers $1, \alpha, \alpha^2, \ldots, \alpha^n$ must be linearly dependent over
K.]

For the rest of the exercises, remember Corollary 1.8 of Chapter IV.

(c) Assume that K is algebraically closed. Let D be a finite dimensional
division ring over K as in parts (a) and (b). Prove that $D = K$, in other
words, show that every element of D lies in K.

24. Let R be a ring containing a field K as a subfield with K in the center of
R. Assume that K is algebraically closed. Assume that R has no two-sided
ideal other than 0 and R. We also assume that R is of finite dimension $> 0$
over K. Let L be a left ideal of R, of smallest dimension $> 0$ over K.

(a) Prove that $\mathrm{End}_R(L) = K$ (i.e. the only R-linear maps of L consist of
multiplication by elements of K). [Hint: Cf. Schur's lemma and Exercise 23.]

(b) Prove that R is ring-isomorphic to the ring of K-linear maps of L into
itself. [Hint: Use Wedderburn-Rieffel.]


V, §5. FACTOR MODULES 

We have already studied factor groups, and rings modulo a two-sided 
ideal. We shall now study the analogous notion for a module. 

Let R be a ring, and M an R-module. By a submodule N we shall
mean an additive subgroup of M which is such that for all $x \in R$ and
$v \in N$ we have $xv \in N$. Thus N itself is a module (i.e. R-module).

We already know how to construct the factor group M/N. Since M is
an abelian group, N is automatically normal in M, so this is an old
story. The elements of the factor group are the cosets $v + N$ with $v \in M$.
We shall now define a multiplication of these cosets by elements of R.
This we do in the natural way. If $x \in R$, we define $x(v + N)$ to be the
coset $xv + N$. If $v_1$ is another coset representative of $v + N$, then we can
write $v_1 = v + w$ with $w \in N$. Hence

$$xv_1 = xv + xw,$$

and $xw \in N$. Consequently $xv_1 + N = xv + N$. Thus our definition is
independent of the choice of representative $v$ of the coset $v + N$. It is now
trivial to verify that all the axioms of a module are satisfied by this
multiplication. We call M/N the factor module of M by N, and also M
modulo N.

We could also use the notation of congruences. If $v$, $v'$ are elements of
M, we write

$$v \equiv v' \pmod{N}$$

to mean that $v - v' \in N$. This amounts to saying that the cosets $v + N$
and $v' + N$ are equal. Thus a coset $v + N$ is nothing but the congruence
class of elements of M which are congruent to $v$ mod N. We can
rephrase our statement that the multiplication of a coset by $x$ is well
defined as follows: If $v \equiv v' \pmod{N}$, then for all $x \in R$, we have
$xv \equiv xv' \pmod{N}$.

Example 1. Let V be a vector space over the field K. Let W be a 
subspace. Then the factor module V/W is called the factor space in this 
case. 

Let M be an R-module, and N a submodule. The map

$$f\colon M \to M/N$$

which to each $v \in M$ associates its congruence class $f(v) = v + N$ is
obviously an R-homomorphism, because

$$f(xv) = xv + N = x(v + N) = xf(v)$$

by definition. It is called the canonical homomorphism. Its kernel is N.

Example 2. Let V be a vector space over the field K. Let W be a
subspace. Then the canonical homomorphism $f\colon V \to V/W$ is a linear
map, and is obviously surjective. Suppose that V is finite dimensional
over K, and let $W'$ be a subspace of V such that V is the direct sum,
$V = W \oplus W'$. If $v \in V$, and we write $v = w + w'$ with $w \in W$ and $w' \in W'$,
then $f(v) = f(w) + f(w') = f(w')$. Let us just consider the map $f$
on $W'$, and let us denote this map by $f'$. Thus for all $w' \in W'$ we have
$f'(w') = f(w')$ by definition. Then $f'$ maps $W'$ onto V/W, and the kernel
of $f'$ is $\{0\}$, because $W \cap W' = \{0\}$. Hence $f'\colon W' \to V/W$ is an
isomorphism between the complementary subspace $W'$ of W and the factor space
V/W. We have such an isomorphism for any choice of complementary
subspace $W'$.


V, §5. EXERCISES

1. Let V be a finite dimensional vector space over the field K, and let W be a
subspace. Let $\{v_1, \ldots, v_r\}$ be a basis of W, and extend it to a basis $\{v_1, \ldots, v_n\}$
of V. Let $f\colon V \to V/W$ be the canonical map. Show that

$$\{f(v_{r+1}), \ldots, f(v_n)\}$$

is a basis of V/W.

2. Let V and W be as in Exercise 1. Let

$$A\colon V \to V$$

be a linear map such that $AW \subset W$, i.e. $Aw \in W$ for all $w \in W$. Show how to
define a linear map

$$\bar{A}\colon V/W \to V/W$$

by defining

$$\bar{A}(v + W) = Av + W.$$

(In the congruence terminology, if $v \equiv v' \pmod{W}$, then $Av \equiv Av' \pmod{W}$.)
Write $\bar{v}$ instead of $v + W$. We call $\bar{A}$ the linear map induced by A on the
factor space.

3. Let V be the vector space generated over $\mathbf{R}$ by the functions $1, t, t^2, e^t, te^t,
t^2 e^t$. Let W be the subspace generated by $1, t, t^2, e^t, te^t$. Let D be the
derivative.

(a) Show that D maps W into itself.

(b) What is the linear map $\bar{D}$ induced by D on the factor space V/W?

4. Let V be the vector space over $\mathbf{R}$ consisting of all polynomials of degree $\leq n$
(for some integer $n \geq 1$). Let W be the subspace consisting of all polynomials
of degree $\leq n - 1$. What is the linear map $\bar{D}$ induced by the derivative D on
the factor space V/W?

5. Let V, W be as in Exercise 1. Let $A\colon V \to V$ be a linear map, and assume that
$AW \subset W$. Let $\{v_1, \ldots, v_n\}$ be the basis of V as in Exercise 1.

(a) Show that the matrix of A with respect to this basis is of type

$$\begin{pmatrix} M_1 & M_3 \\ O & M_2 \end{pmatrix}$$

where $M_1$ is a square $r \times r$ matrix, and $M_2$ is a square $(n - r) \times (n - r)$
matrix.

(b) In Exercise 2, show that the matrix of $\bar{A}$ with respect to the basis
$\{\bar{v}_{r+1}, \ldots, \bar{v}_n\}$ is precisely the matrix $M_2$.


V, §6. FREE ABELIAN GROUPS 

We shall deal with commutative groups throughout this section. We 
wish to analyze under which conditions we can define the analogue of a 
basis for such groups. 

Let A be an abelian group. By a basis for A we shall mean a set of
elements $v_1, \ldots, v_n$ ($n \geq 1$) of A such that every element of A has a unique
expression as a sum

$$c_1 v_1 + \cdots + c_n v_n$$

with integers $c_i \in \mathbf{Z}$. Thus a basis for an abelian group is defined in a
manner entirely similar to a basis for vector spaces, except that the
coefficients $c_1, \ldots, c_n$ are now required to be integers.
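
To see why the requirement of integer coefficients matters, here is a small Python sketch for the group $\mathbf{Z}^2$ (chosen only as an illustration; the helper names `coordinates` and `has_integer_coordinates` are hypothetical). It solves for the coefficients over the rationals by Cramer's rule and then tests whether they are integers.

```python
from fractions import Fraction

def coordinates(v1, v2, w):
    """Solve c1*v1 + c2*v2 = w over Q by Cramer's rule (2 x 2 case)."""
    det = v1[0] * v2[1] - v1[1] * v2[0]
    if det == 0:
        raise ValueError("v1 and v2 are linearly dependent")
    c1 = Fraction(w[0] * v2[1] - w[1] * v2[0], det)
    c2 = Fraction(v1[0] * w[1] - v1[1] * w[0], det)
    return c1, c2

def has_integer_coordinates(v1, v2, w):
    """True if w is a combination of v1, v2 with integer coefficients."""
    return all(c.denominator == 1 for c in coordinates(v1, v2, w))

# (1, 1) and (0, 1): the element (3, 5) equals 3*(1, 1) + 2*(0, 1).
print(coordinates((1, 1), (0, 1), (3, 5)))
print(has_integer_coordinates((1, 1), (0, 1), (3, 5)))   # True

# (2, 0) and (0, 1) are linearly independent, but (1, 0) needs c1 = 1/2.
print(has_integer_coordinates((2, 0), (0, 1), (1, 0)))   # False
```

The second pair of vectors would be a basis of the vector space of rational pairs, yet it fails to be a basis of the abelian group $\mathbf{Z}^2$ in the sense just defined.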

Theorem 6.1. Let A be an abelian group with a basis $\{v_1, \ldots, v_n\}$. Let B
be an abelian group, and let $w_1, \ldots, w_n$ be elements of B. Then there
exists a unique group-homomorphism $f\colon A \to B$ such that $f(v_i) = w_i$ for all
$i = 1, \ldots, n$.

Proof. Copy the analogous proof for vector spaces, omitting irrelevant
scalar multiplication, etc.

To avoid confusion when dealing with bases of abelian groups as 
above, and vector space bases, we shall call bases of abelian groups Z- 
bases. 

Theorem 6.2. Let A be a non-zero subgroup of $\mathbf{R}^n$. Assume that in any
bounded region of space there exists only a finite number of elements of
A. Let m be the maximal number of elements of A which are linearly
independent over $\mathbf{R}$. Then we can select m elements of A which are
linearly independent over $\mathbf{R}$, and form a Z-basis of A.

Proof. Let $\{w_1, \ldots, w_m\}$ be a maximal set of elements of A linearly
independent over $\mathbf{R}$. Let V be the vector space generated by these
elements, and let $V_{m-1}$ be the space generated by $w_1, \ldots, w_{m-1}$. Let $A_{m-1}$
be the intersection of A and $V_{m-1}$. Then certainly, in any bounded region
of space, there exists only a finite number of elements of $A_{m-1}$.
Therefore, if $m > 1$, we could have chosen inductively $\{w_1, \ldots, w_m\}$ such
that $\{w_1, \ldots, w_{m-1}\}$ is a Z-basis of $A_{m-1}$.

Now consider the set S of all elements of A which can be written in
the form

$$t_1 w_1 + \cdots + t_m w_m$$

with $0 \leq t_i < 1$ if $i = 1, \ldots, m - 1$ and $0 \leq t_m \leq 1$. This set S is certainly
bounded, and hence contains only a finite number of elements (among
which is $w_m$). We select an element $v_m$ in this set whose last coordinate $t_m$
is the smallest possible $> 0$. We shall prove that

$$\{w_1, \ldots, w_{m-1}, v_m\}$$

is a Z-basis for A. Write $v_m$ as a linear combination of $w_1, \ldots, w_m$ with
real coefficients,

$$v_m = c_1 w_1 + \cdots + c_m w_m, \qquad 0 < c_m \leq 1.$$

Let $v$ be an element of A, and write

$$v = x_1 w_1 + \cdots + x_m w_m$$

with $x_i \in \mathbf{R}$. Let $q_m$ be the integer such that

$$q_m c_m \leq x_m < (q_m + 1) c_m.$$

Then the last coordinate of $v - q_m v_m$ with respect to $\{w_1, \ldots, w_m\}$ is equal
to $x_m - q_m c_m$, and

$$0 \leq x_m - q_m c_m < c_m \leq 1.$$

Let $q_i$ ($i = 1, \ldots, m - 1$) be integers such that

$$q_i \leq x_i - q_m c_i < q_i + 1.$$

Then

$$(1) \qquad v - q_m v_m - q_1 w_1 - \cdots - q_{m-1} w_{m-1}$$

is an element of S. If its last coordinate is not 0, then it would be an
element with last coordinate smaller than $c_m$, contrary to the
construction of $v_m$. Hence its last coordinate is 0, and hence the element of (1)
lies in $V_{m-1}$. By induction, it can be written as a linear combination of
$w_1, \ldots, w_{m-1}$ with integer coefficients, and from this it follows at once
that $v$ can be written as a linear combination of $w_1, \ldots, w_{m-1}, v_m$ with
integral coefficients. Furthermore, it is clear that $w_1, \ldots, w_{m-1}, v_m$ are
linearly independent over $\mathbf{R}$, and hence satisfy the requirements of our
theorem.

We can now apply our theorem to more general groups. Let A be an
additive group, and let $f\colon A \to A'$ be an isomorphism of A with a group
$A'$. If $A'$ admits a basis, say $\{v_1', \ldots, v_n'\}$, and if $v_i$ is the element of A such
that $f(v_i) = v_i'$, then it is immediately verified that $\{v_1, \ldots, v_n\}$ is a basis of
A.


Theorem 6.3. Let A be an additive group, having a basis with n
elements. Let B be a subgroup $\neq \{0\}$. Then B has a basis with $\leq n$
elements.

Proof. Let $\{v_1, \ldots, v_n\}$ be a basis for A. Let $\{e_1, \ldots, e_n\}$ be the standard
unit vectors of $\mathbf{R}^n$. By Theorem 6.1, there is a homomorphism

$$f\colon A \to \mathbf{R}^n$$

such that $f(v_i) = e_i$ for $i = 1, \ldots, n$, and this homomorphism is obviously
injective. Hence it gives an isomorphism of A with its image in $\mathbf{R}^n$. On
the other hand, it is trivial to verify that in any bounded region of $\mathbf{R}^n$,
there is only a finite number of elements of the image $f(A)$, because in
any bounded region, the coefficients of a vector

$$(c_1, \ldots, c_n)$$

are bounded. Hence by Theorem 6.2 we conclude that $f(B)$ has a
Z-basis, whence B has a Z-basis, with $\leq n$ elements.

Theorem 6.4. Let A be an additive group having a basis with n
elements. Then all bases of A have this same number of elements n.

Proof. We look at our same homomorphism $f\colon A \to \mathbf{R}^n$ as in the
proof of Theorem 6.3. Let $\{w_1, \ldots, w_m\}$ be a basis of A. Each $v_i$ is a
linear combination with integer coefficients of $w_1, \ldots, w_m$. Hence $f(v_i) = e_i$ is
a linear combination with integer coefficients of $f(w_1), \ldots, f(w_m)$. Hence
$e_1, \ldots, e_n$ are in the space generated by $f(w_1), \ldots, f(w_m)$. By the theory of
bases for vector spaces, we conclude that $m \geq n$, whence $m = n$ by symmetry
(interchange the roles of the two bases).

An abelian group is said to be finitely generated if it has a finite 
number of generators. It is said to be free if it has a basis. 

Corollary 6.5. Let $A \neq \{0\}$ be a finitely generated abelian group.
Assume that A does not contain any element of finite period except the
unit element. Then A has a basis.

Proof. The proof will be left as an exercise, see Exercises 1 and 2 
which also give you the main ideas for the proof. 

Remark. We have carried out the above theory in Euclidean space to 
emphasize certain geometric aspects. Similar ideas can be carried out to 
prove Corollary 6.5 without recourse to Euclidean space. See for in- 
stance my Algebra. 

Let A be an abelian group. In Chapter II, §7 we defined the torsion
subgroup $A_{\mathrm{tor}}$ to be the subgroup consisting of all the elements of finite
period in A. We refer to Chapter II, Theorem 7.1.

Theorem 6.6. Suppose that A is a finitely generated abelian group.
Then $A/A_{\mathrm{tor}}$ is a free abelian group, and A is the direct sum

$$A = A_{\mathrm{tor}} \oplus F$$

where F is free.

Proof. Let $\{a_1, \ldots, a_m\}$ be generators of A. If $a \in A$ let $\bar{a}$ be its image
in the factor group $A/A_{\mathrm{tor}}$. Then $\bar{a}_1, \ldots, \bar{a}_m$ are generators of $A/A_{\mathrm{tor}}$,
which is therefore finitely generated. Let $\bar{a} \in A/A_{\mathrm{tor}}$, and suppose $\bar{a}$ has
finite period, say $d\bar{a} = 0$ for some positive integer $d$. This means that
$da \in A_{\mathrm{tor}}$, so there exists a positive integer $d'$ such that $d'da = 0$, whence $a$
itself lies in $A_{\mathrm{tor}}$, so $\bar{a} = 0$. Hence the torsion subgroup of $A/A_{\mathrm{tor}}$ is trivial.
By Corollary 6.5 we conclude that $A/A_{\mathrm{tor}}$ is free.

Let $b_1, \ldots, b_r \in A$ be elements such that $\{\bar{b}_1, \ldots, \bar{b}_r\}$ is a basis of $A/A_{\mathrm{tor}}$.
Then $b_1, \ldots, b_r$ are linearly independent over Z. Indeed, suppose $d_1, \ldots, d_r$
are integers such that

$$d_1 b_1 + \cdots + d_r b_r = 0.$$

Then

$$d_1 \bar{b}_1 + \cdots + d_r \bar{b}_r = 0,$$

whence $d_i = 0$ for all $i$ by assumption that $\{\bar{b}_1, \ldots, \bar{b}_r\}$ is a basis of $A/A_{\mathrm{tor}}$.
Let F be the subgroup generated by $b_1, \ldots, b_r$. Then F is free. We now
claim that

$$A = A_{\mathrm{tor}} \oplus F.$$

Indeed, let $a \in A$. There exist integers $x_1, \ldots, x_r$ such that

$$\bar{a} = x_1 \bar{b}_1 + \cdots + x_r \bar{b}_r.$$

Hence $\bar{a} - (x_1 \bar{b}_1 + \cdots + x_r \bar{b}_r) = 0$, so $a - (x_1 b_1 + \cdots + x_r b_r) \in A_{\mathrm{tor}}$. This
proves that

$$A = A_{\mathrm{tor}} + F.$$

In addition, suppose $a \in A_{\mathrm{tor}} \cap F$. The only element of finite order in a
free group is the 0 element. Hence $a = 0$. This proves the theorem.


V, §6. EXERCISES 

1. Let A be an abelian group with a finite number of generators, and assume
that A does not contain any element of finite period except the unit element.
We write A additively. Let $d$ be a positive integer. Show that the map $x \mapsto dx$
is an injective homomorphism of A into itself, whose image is isomorphic to
A.

2. Let the notation be as in Exercise 1. Let $\{a_1, \ldots, a_m\}$ be a set of generators of
A. Let $\{a_1, \ldots, a_r\}$ be a maximal subset linearly independent over Z. Let B be
the subgroup generated by $a_1, \ldots, a_r$. Show that there exists a positive integer
$d$ such that $dx$ lies in B for all $x$ in A. Using Theorem 6.3 and Exercise 1,
conclude that A has a basis.

V, §7. MODULES OVER PRINCIPAL RINGS

Throughout this section, we let R be a principal ring.

Let M be an R-module, $M \neq \{0\}$. If M is generated by one element $v$,
then we say that M is cyclic. In this case, we have $M = Rv$. The map

$$x \mapsto xv$$

is a homomorphism of R into M, viewing both R and M as R-modules.
Let J be the kernel of this homomorphism. Then J consists of all
elements $x \in R$ such that $xv = 0$. Since R is assumed principal, either
$J = \{0\}$ or there is some element $a \in R$ which generates J. Then we have
an isomorphism of R-modules

$$R/J = R/aR \approx M.$$

The element $a$ is uniquely determined up to multiple by a unit of R, and
we shall say that $a$ is a period of $v$. Any unit multiple of $a$ will also be
called a period of $v$.

Let M be an R-module. We say that M is a torsion module if given
$v \in M$ there exists some element $a \in R$, $a \neq 0$ such that $av = 0$. Let $p$ be a
prime element of R. We denote by $M(p)$ the subset of elements $v \in M$
such that there exists some power $p^r$ ($r \geq 1$) satisfying $p^r v = 0$. These
definitions are analogous to those made for finite abelian groups in
Chapter II, §7. A module is said to be finitely generated if it has a finite
number of generators. As with abelian groups, we say that M has
exponent $a$ if every element of M has period dividing $a$, or equivalently
$av = 0$ for all $v \in M$.

Observe that if M is finitely generated and is a torsion module, then
there exists $a \in R$, $a \neq 0$, such that $aM = \{0\}$. (Proof?)

The following statements are used just as for abelian groups, and also
illustrate some notions given more generally in §4 and §5.

Let $p$ be a prime of R. Then $R/pR$ is a simple module.

Let $a, b \in R$ be relatively prime. Let M be a module such that $aM = 0$.
Then the map

$$v \mapsto bv$$

is an automorphism of M.

The proofs will be left as exercises. 

The next theorem is entirely analogous to Theorem 7.1 of Chapter II. 

Theorem 7.1. Let M be a finitely generated torsion module over the
principal ring R. Then M is the direct sum of its submodules $M(p)$ for
all primes $p$ such that $M(p) \neq \{0\}$. In fact, let $a \in R$ be such that
$aM = 0$, and suppose we can write

$$a = bc \quad \text{with } b, c \text{ relatively prime}.$$

Let $M_b$ be the subset of M consisting of those elements $v$ such that
$bv = 0$, and similarly for $M_c$. Then

$$M = M_b \oplus M_c.$$


Proof. You can just copy the proof of Theorem 7.1 of Chapter II, 
since all the notions which we used there for abelian groups have been 
defined for modules over a principal ring. This kind of translation just 
continues what we did in Chapter IV, §6 for principal rings themselves. 
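
The decomposition $M = M_b \oplus M_c$ can be checked by brute force in a small case. The following Python sketch assumes $R = \mathbf{Z}$ and $M = \mathbf{Z}/n$, which is only one instance of the theorem; the function names `killed_by` and `decomposes` are hypothetical.

```python
from itertools import product

def killed_by(n, b):
    """Elements v of Z/n with b*v = 0, i.e. the subset M_b."""
    return [v for v in range(n) if (b * v) % n == 0]

def decomposes(n, b, c):
    """Check that every element of Z/n is x + y with x in M_b, y in M_c,
    in exactly one way (so that Z/n is the direct sum of M_b and M_c)."""
    Mb, Mc = killed_by(n, b), killed_by(n, c)
    sums = sorted((x + y) % n for x, y in product(Mb, Mc))
    return sums == list(range(n))

print(killed_by(12, 4))       # [0, 3, 6, 9]
print(killed_by(12, 3))       # [0, 4, 8]
print(decomposes(12, 4, 3))   # True:  12 = 4 * 3 with 4, 3 relatively prime
print(decomposes(12, 2, 6))   # False: 2 and 6 are not relatively prime
```

The last call shows that the hypothesis that $b$ and $c$ are relatively prime cannot simply be dropped.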


Theorem 7.2. Let M be a finitely generated torsion module over the
principal ring R, and assume that there is a prime $p$ such that
$p^r M = \{0\}$ for some positive integer $r$. Then M is a direct sum of cyclic
submodules

$$M = \bigoplus_{i=1}^{s} Rv_i \quad \text{where } Rv_i \approx R/p^{r_i}R,$$

so $v_i$ has period $p^{r_i}$. If we order these modules so that

$$r_1 \geq r_2 \geq \cdots \geq r_s \geq 1,$$

then the sequence of integers $r_1, \ldots, r_s$ is uniquely determined.

Proof. Again, the proof is similar to the proof of Theorem 7.2 in 
Chapter II. We repeat the proof here for convenience of the reader, so 
that you can see how to translate a slightly more involved proof from the 
integers to principal rings. 

We start with a remark. Let $v \in M$, $v \neq 0$. Let $k$ be an integer $\geq 0$
such that $p^k v \neq 0$ and let $p^m$ be a period of $p^k v$. Then $v$ has period $p^{k+m}$.
Proof: We certainly have $p^{k+m} v = 0$, and if $p^n v = 0$ then first $n \geq k$, and
second $n \geq k + m$, otherwise the period of $p^k v$ would divide $p^m$ and not be
equal to $p^m$, contrary to the definition of $p^m$.

We shall now prove the theorem by induction on the number of
generators. Suppose that M is generated by $q$ elements. After reordering
these elements if necessary, we may assume that one of these elements $v_1$
has maximal period. In other words, $v_1$ has period $p^{r_1}$, and if $v \in M$ then
$p^r v = 0$ with $r \leq r_1$. We let $M_1$ be the cyclic module generated by $v_1$.

Lemma 7.3. Let $\bar{v}$ be an element of $M/M_1$, of period $p^r$. Then there
exists a representative $w$ of $\bar{v}$ in M which also has period $p^r$.

Proof. Let $v$ be any representative of $\bar{v}$ in M. Then $p^r v$ lies in $M_1$, say
$p^r v = cv_1$ with $c \in R$. We note that the period of $\bar{v}$ divides the period of $v$.
Write $c = p^k t$ where $t$ is prime to $p$. Then $tv_1$ is also a generator of $M_1$
(proof?), and hence has period $p^{r_1}$. By our previous remark, the element $v$
has period

$$p^{r + r_1 - k},$$

whence by hypothesis, $r + r_1 - k \leq r_1$ and $r \leq k$. This proves that there
exists an element $w_1 \in M_1$ such that $p^r v = p^r w_1$. Let $w = v - w_1$. Then $w$
is a representative for $\bar{v}$ in M and $p^r w = 0$. Since the period of $w$ is at
least $p^r$ we conclude that $w$ has period equal to $p^r$. This proves the
lemma.

We return to the main proof. The factor module $M/M_1$ is generated
by $q - 1$ elements, and so by induction $M/M_1$ has a direct sum
decomposition

$$M/M_1 = \bar{M}_2 \oplus \cdots \oplus \bar{M}_s$$

into cyclic modules with $\bar{M}_i \approx R/p^{r_i}R$. Let $\bar{v}_i$ be a generator for $\bar{M}_i$
($i = 2, \ldots, s$), and let $v_i$ be a representative in M of the same period as $\bar{v}_i$
according to the lemma. Let $M_i$ be the cyclic module generated by $v_i$.
We contend that M is the direct sum of $M_1, \ldots, M_s$, and we now prove
this.

Given $v \in M$, let $\bar{v}$ denote its residue class in $M/M_1$. Then there exist
elements $c_2, \ldots, c_s \in R$ such that

$$\bar{v} = c_2 \bar{v}_2 + \cdots + c_s \bar{v}_s.$$

Hence $v - (c_2 v_2 + \cdots + c_s v_s)$ lies in $M_1$, and there exists an element $c_1 \in R$
such that

$$v = c_1 v_1 + c_2 v_2 + \cdots + c_s v_s.$$

Hence $M = M_1 + \cdots + M_s$.

Conversely, suppose $c_1, \ldots, c_s$ are elements of R such that

$$c_1 v_1 + \cdots + c_s v_s = 0.$$

Since $v_i$ has period $p^{r_i}$ ($i = 1, \ldots, s$), if we write $c_i = p^{m_i} t_i$ with $p \nmid t_i$, then
we may suppose $m_i \leq r_i$. Putting a bar on this equation yields

$$c_2 \bar{v}_2 + \cdots + c_s \bar{v}_s = 0.$$

Since $M/M_1 = \bar{M}$ is a direct sum of $\bar{M}_2, \ldots, \bar{M}_s$ we conclude that $c_j \bar{v}_j = 0$
for $j = 2, \ldots, s$. (See part (c) of Proposition 4.4.) Hence $p^{r_j}$ divides $c_j$ for
$j = 2, \ldots, s$, whence also $c_j v_j = 0$ for $j = 2, \ldots, s$ since $\bar{v}_j$ and $v_j$ have the
same period. The original relation then gives $c_1 v_1 = 0$ as well. This proves
that M is the direct sum of $M_1, \ldots, M_s$ and concludes the proof of the
existence part of the theorem.

Now we prove uniqueness. If

$$M \approx R/p^{r_1}R \oplus \cdots \oplus R/p^{r_s}R$$

then we say that M has type $(p^{r_1}, \ldots, p^{r_s})$, just as we did for abelian
groups. Suppose that M is written in two ways as a product of cyclic
submodules, say of types

$$(p^{r_1}, \ldots, p^{r_s}) \quad \text{and} \quad (p^{m_1}, \ldots, p^{m_k})$$

with $r_1 \geq r_2 \geq \cdots \geq r_s \geq 1$, and $m_1 \geq m_2 \geq \cdots \geq m_k \geq 1$. Then $pM$ is also
a torsion module, of type

$$(p^{r_1 - 1}, \ldots, p^{r_s - 1}) \quad \text{and} \quad (p^{m_1 - 1}, \ldots, p^{m_k - 1}).$$

It is understood that if some exponent $r_i$ or $m_j$ is equal to 1, then the
factor corresponding to

$$p^{r_i - 1} \quad \text{or} \quad p^{m_j - 1}$$

in $pM$ is simply the trivial module 0. We now make an induction on the
sum $r_1 + \cdots + r_s$, which may be called the length of the module. If this
length is 1 in some representation of M as a direct sum, then this length
is 1 in every representation, because $R/pR$ is a simple module, whereas it
is immediately verified that if the length is $> 1$, then there exists a
submodule $\neq 0$ and $\neq M$, so the module cannot be simple. Thus the
uniqueness is proved for modules of length 1.

By induction, the subsequence of $(r_1 - 1, \ldots, r_s - 1)$ consisting of those
integers $\geq 1$ is uniquely determined, and is the same as the corresponding
subsequence of $(m_1 - 1, \ldots, m_k - 1)$. In other words, we have
$r_i - 1 = m_i - 1$ for all those integers $i$ such that $r_i - 1$ and $m_i - 1 \geq 1$.
Hence $r_i = m_i$ for all these integers $i$, and the two sequences

$$(p^{r_1}, \ldots, p^{r_s}) \quad \text{and} \quad (p^{m_1}, \ldots, p^{m_k})$$

can differ only in their last components which can be equal to $p$. These
correspond to factors of type $(p, \ldots, p)$ occurring say $\nu$ times in the first
sequence, and $\mu$ times in the second sequence. We have to show that
$\nu = \mu$.

Let $M_p$ be the submodule of M consisting of all elements $v \in M$ such
that $pv = 0$. Since by hypothesis $pM_p = 0$, it follows that $M_p$ is a module
over $R/pR$, which is a field, so $M_p$ is a vector space over $R/pR$. If N is a
cyclic module, say $N = R/p^r R$, then $N_p = p^{r-1}R/p^r R \approx R/pR$, and so $N_p$
has dimension 1 over $R/pR$. Hence

$$\dim_{R/pR} M_p = s \quad \text{and also} \quad = k,$$

so $s = k$. But we have seen in the preceding paragraph that

$$(p^{r_1}, \ldots, p^{r_s}) = (p^{r_1}, \ldots, p^{r_{s-\nu}}, \underbrace{p, \ldots, p}_{\nu \text{ times}}),$$

$$(p^{m_1}, \ldots, p^{m_k}) = (p^{r_1}, \ldots, p^{r_{k-\mu}}, \underbrace{p, \ldots, p}_{\mu \text{ times}}),$$

and that $s - \nu = k - \mu$. Since $s = k$ it follows that $\nu = \mu$, and the
theorem is proved.

Remark. In the analogous statement for abelian groups, we finished 
the proof by considering the order of groups. Here we use an argument 
having to do with the dimension of a vector space over R/pR. Otherwise, 
the argument is entirely similar. 
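
The dimension count can be made concrete in a small case. The Python sketch below assumes $R = \mathbf{Z}$ and takes M to be a product of cyclic groups $\mathbf{Z}/p^{r_i}$, so that $R/pR$ is the field with $p$ elements and $M_p$ has exactly $p^s$ elements when its dimension is $s$. The function names `p_torsion_count` and `number_of_cyclic_factors` are hypothetical.

```python
from itertools import product

def p_torsion_count(p, exponents):
    """Count the elements v of Z/p^{r_1} x ... x Z/p^{r_s} with p*v = 0."""
    moduli = [p ** r for r in exponents]
    return sum(
        1
        for v in product(*(range(m) for m in moduli))
        if all((p * x) % m == 0 for x, m in zip(v, moduli))
    )

def number_of_cyclic_factors(p, exponents):
    """Recover s from |M_p| = p^s, i.e. s is the dimension of M_p over Z/p."""
    count, s = p_torsion_count(p, exponents), 0
    while count > 1:
        count //= p
        s += 1
    return s

print(p_torsion_count(2, [3, 2, 1]))            # 8
print(number_of_cyclic_factors(2, [3, 2, 1]))   # 3
print(number_of_cyclic_factors(3, [2, 2]))      # 2
```

In each case the number recovered from $M_p$ agrees with the number of cyclic factors, whatever the exponents are.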

V, §7. EXERCISES 

1. Let $a$, $b$ be relatively prime elements of R. Let M be an R-module, and denote
by $a_M\colon M \to M$ multiplication by $a$. In other words, $a_M(v) = av$ for $v \in M$.
Suppose that $a_M b_M = 0$. Prove that

$$\mathrm{Im}\, a_M = \mathrm{Ker}\, b_M.$$

2. Let M be a finitely generated torsion module over R. Prove that there exists
$a \in R$, $a \neq 0$ such that $a_M = 0$.

3. Let $p$ be a prime of R. Prove that $R/pR$ is a simple R-module.

4. Let $a \in R$, $a \neq 0$. If $a$ is not prime, prove that $R/aR$ is not a simple R-module.

5. Let $a, b \in R$ be relatively prime. Let M be a module such that $a_M = 0$. Prove
that the map $v \mapsto bv$ is an automorphism of M.


V, §8. EIGENVECTORS AND EIGENVALUES 

Let $V$ be a vector space over a field $K$, and let

$$A: V \to V$$

be a linear map of $V$ into itself. An element $v \in V$ is called an eigenvector
of $A$ if there exists $\lambda \in K$ such that $Av = \lambda v$. If $v \neq 0$ then $\lambda$ is uniquely
determined, because $\lambda_1 v = \lambda_2 v$ implies $\lambda_1 = \lambda_2$. In this case, we say that $\lambda$
is an eigenvalue of $A$ belonging to the eigenvector $v$. We also say that $v$
is an eigenvector with the eigenvalue $\lambda$. Instead of eigenvector and
eigenvalue, one also uses the terms characteristic vector and characteristic
value.

If $A$ is a square $n \times n$ matrix then an eigenvector of $A$ is by definition
an eigenvector of the linear map of $K^n$ into itself represented by this
matrix. Thus an eigenvector $X$ of $A$ is a (column) vector of $K^n$ for which
there exists $\lambda \in K$ such that $AX = \lambda X$.


Example 1. Let $V$ be the vector space over $\mathbf{R}$ consisting of all
infinitely differentiable functions. Let $\lambda \in \mathbf{R}$. Then the function $f$ such
that $f(t) = e^{\lambda t}$ is an eigenvector of the derivative $d/dt$ because $df/dt = \lambda e^{\lambda t}$.


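Example 2. Let

$$A = \begin{pmatrix} a_1 & 0 & \cdots & 0 \\ 0 & a_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_n \end{pmatrix}$$

be a diagonal matrix. Then every unit vector $E^i$ ($i = 1, \ldots, n$) is an
eigenvector of $A$. In fact, we have $AE^i = a_i E^i$:

$$\begin{pmatrix} a_1 & & 0 \\ & \ddots & \\ 0 & & a_n \end{pmatrix} \begin{pmatrix} 0 \\ \vdots \\ 1 \\ \vdots \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ \vdots \\ a_i \\ \vdots \\ 0 \end{pmatrix}.$$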



Example 3. If A: V -> V is a linear map, and v is an eigenvector of A, 
then for any non-zero scalar c, cv is also an eigenvector of A, with the 
same eigenvalue. 


Theorem 8.1. Let $V$ be a vector space and let $A: V \to V$ be a linear map.
Let $\lambda \in K$. Let $V_\lambda$ be the subspace of $V$ generated by all eigenvectors of
$A$ having $\lambda$ as eigenvalue. Then every non-zero element of $V_\lambda$ is an
eigenvector of $A$ having $\lambda$ as eigenvalue.

Proof. Let $v_1, v_2 \in V$ be such that $Av_1 = \lambda v_1$ and $Av_2 = \lambda v_2$. Then

$$A(v_1 + v_2) = Av_1 + Av_2 = \lambda v_1 + \lambda v_2 = \lambda(v_1 + v_2).$$

If $c \in K$ then $A(cv_1) = cAv_1 = c\lambda v_1 = \lambda c v_1$. This proves our theorem.

The subspace $V_\lambda$ in Theorem 8.1 is called the eigenspace of $A$
belonging to $\lambda$.

Note. If $v_1, v_2$ are eigenvectors of $A$ with different eigenvalues $\lambda_1 \neq \lambda_2$
then of course $v_1 + v_2$ is not an eigenvector of $A$. In fact, we have the
following theorem:

Theorem 8.2. Let $V$ be a vector space and let $A: V \to V$ be a linear map.
Let $v_1, \ldots, v_m$ be eigenvectors of $A$, with eigenvalues $\lambda_1, \ldots, \lambda_m$
respectively. Assume that these eigenvalues are distinct, i.e.

$$\lambda_i \neq \lambda_j \quad\text{if } i \neq j.$$

Then $v_1, \ldots, v_m$ are linearly independent.

Proof. By induction on $m$. For $m = 1$, an element $v_1 \in V$, $v_1 \neq 0$ is
linearly independent. Assume $m > 1$. Suppose that we have a relation

$$(*) \qquad c_1 v_1 + \cdots + c_m v_m = 0$$

with scalars $c_i$. We must prove all $c_i = 0$. We multiply our relation (*)
by $\lambda_1$ to obtain

$$c_1 \lambda_1 v_1 + \cdots + c_m \lambda_1 v_m = 0.$$

We also apply $A$ to our relation (*). By linearity, we obtain

$$c_1 \lambda_1 v_1 + \cdots + c_m \lambda_m v_m = 0.$$

We now subtract these last two expressions, and obtain

$$c_2(\lambda_2 - \lambda_1)v_2 + \cdots + c_m(\lambda_m - \lambda_1)v_m = 0.$$

Since $\lambda_j - \lambda_1 \neq 0$ for $j = 2, \ldots, m$ we conclude by induction that

$$c_2 = \cdots = c_m = 0.$$

Going back to our original relation, we see that $c_1 v_1 = 0$, whence $c_1 = 0$,
and our theorem is proved.

Quite generally, let V be a finite dimensional vector space, and let 


L: V -> V 



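be a linear map. Let $\{v_1, \ldots, v_n\}$ be a basis of $V$. We say that this basis
diagonalizes $L$ if each $v_i$ is an eigenvector of $L$, so $Lv_i = c_i v_i$ with some
scalar $c_i$. Then the matrix representing $L$ with respect to this basis is the
diagonal matrix

$$\begin{pmatrix} c_1 & 0 & \cdots & 0 \\ 0 & c_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & c_n \end{pmatrix}.$$
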
We say that the linear map L can be diagonalized if there exists a basis of 
V consisting of eigenvectors. We say that an n x n matrix A can be 
diagonalized if its associated linear map L A can be diagonalized. 

We shall now see how we can use determinants to find the eigenvalues
of a matrix. We assume that readers are acquainted with determinants.

Theorem 8.3. Let $V$ be a finite dimensional vector space over $K$, and let $\lambda$ be
an element of $K$. Let $A: V \to V$ be a linear map. Then $\lambda$ is an eigenvalue of $A$ if
and only if $A - \lambda I$ is not invertible.

Proof. Assume that $\lambda$ is an eigenvalue of $A$. Then there exists an
element $v \in V$, $v \neq 0$ such that $Av = \lambda v$. Hence $Av - \lambda v = 0$, and
$(A - \lambda I)v = 0$. Hence $A - \lambda I$ has a non-zero kernel, and $A - \lambda I$ cannot
be invertible. Conversely, assume that $A - \lambda I$ is not invertible. By
Theorem 2.4 we see that $A - \lambda I$ must have a non-zero kernel, meaning
that there exists an element $v \in V$, $v \neq 0$ such that $(A - \lambda I)v = 0$. Hence
$Av - \lambda v = 0$, and $Av = \lambda v$. Thus $\lambda$ is an eigenvalue of $A$. This proves
our theorem.

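Let $A$ be an $n \times n$ matrix, $A = (a_{ij})$. We define the characteristic
polynomial $P_A$ to be the determinant

$$P_A(t) = \operatorname{Det}(tI - A),$$

or written out in full,

$$P(t) = \begin{vmatrix} t - a_{11} & -a_{12} & \cdots & -a_{1n} \\ -a_{21} & t - a_{22} & \cdots & -a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ -a_{n1} & -a_{n2} & \cdots & t - a_{nn} \end{vmatrix}.$$

We can also view $A$ as a linear map from $K^n$ to $K^n$, and we also say
that $P_A(t)$ is the characteristic polynomial of this linear map.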
Example 4. The characteristic polynomial of the matrix

$$A = \begin{pmatrix} 1 & -1 & 3 \\ -2 & 1 & 1 \\ 0 & 1 & -1 \end{pmatrix}$$

is

$$\begin{vmatrix} t - 1 & 1 & -3 \\ 2 & t - 1 & -1 \\ 0 & -1 & t + 1 \end{vmatrix},$$

which we expand according to the first column, to find

$$P_A(t) = t^3 - t^2 - 4t + 6.$$
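The expansion in Example 4 can be checked mechanically. The sketch below is only an illustration, not part of the text's argument: the helper names `det`, `char_poly_at` and `p` are hypothetical. It computes $\operatorname{Det}(tI - A)$ by expansion along the first column for several integer values of $t$ and compares the result with $t^3 - t^2 - 4t + 6$. Two polynomials of degree 3 that agree at more than three points are equal, so agreement at seven points confirms the formula.

```python
def det(m):
    """Determinant of a square matrix by expansion along the first column."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for i in range(n):
        # Minor obtained by deleting row i and the first column.
        minor = [row[1:] for k, row in enumerate(m) if k != i]
        total += (-1) ** i * m[i][0] * det(minor)
    return total


# The matrix of Example 4.
A = [[1, -1, 3],
     [-2, 1, 1],
     [0, 1, -1]]


def char_poly_at(a, t):
    """Value of Det(tI - a) at the integer t."""
    n = len(a)
    return det([[(t if i == j else 0) - a[i][j] for j in range(n)]
                for i in range(n)])


def p(t):
    """The polynomial claimed in Example 4."""
    return t**3 - t**2 - 4 * t + 6


# Compare the two sides at t = -3, ..., 3.
values_agree = all(char_poly_at(A, t) == p(t) for t in range(-3, 4))
print(values_agree)
```

The cofactor expansion is exponential in the size of the matrix, which is acceptable for a 3 x 3 check but not a method for large matrices.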

For an arbitrary matrix $A = (a_{ij})$, the characteristic polynomial can be
found by expanding according to the first column, and will always consist
of a sum

$$(t - a_{11}) \cdots (t - a_{nn}) + \cdots.$$

Each term other than the one we have written down will have degree
$< n$. Hence the characteristic polynomial is of type

$$P_A(t) = t^n + \text{terms of lower degree}.$$

For the next theorem, we assume that you know the following
property of determinants:

A square matrix $M$ over a field $K$ is invertible if and only if its
determinant is $\neq 0$.

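Theorem 8.4. Let $A$ be an $n \times n$ matrix. An element $\lambda \in K$ is an
eigenvalue of $A$ if and only if $\lambda$ is a root of the characteristic
polynomial of $A$. If $K$ is algebraically closed, then $A$ has an eigenvalue
in $K$.

Proof. Assume that $\lambda$ is an eigenvalue of $A$. Then $\lambda I - A$ is not
invertible by Theorem 8.3 and hence $\operatorname{Det}(\lambda I - A) = 0$. Consequently $\lambda$ is
a root of the characteristic polynomial. Conversely, if $\lambda$ is a root of the
characteristic polynomial, then

$$\operatorname{Det}(\lambda I - A) = 0,$$

and hence we conclude that $\lambda I - A$ is not invertible. Hence $\lambda$ is an
eigenvalue of $A$ by Theorem 8.3. Finally, if $K$ is algebraically closed, then
the characteristic polynomial, which has degree $n \geq 1$, has a root in $K$,
and this root is an eigenvalue of $A$.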
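As an illustration of Theorem 8.4, here is a small worked case. The $2 \times 2$ matrix is chosen here for convenience and does not come from the text. Its eigenvalues are read off as the roots of the characteristic polynomial, and an eigenvector for each is found by solving $(A - \lambda I)X = 0$.

```latex
\[
A = \begin{pmatrix} 1 & 4 \\ 2 & 3 \end{pmatrix}, \qquad
P_A(t) = \begin{vmatrix} t-1 & -4 \\ -2 & t-3 \end{vmatrix}
       = (t-1)(t-3) - 8 = t^2 - 4t - 5 = (t-5)(t+1).
\]
% The roots are 5 and -1, so these are the eigenvalues of A.
\[
A \begin{pmatrix} 1 \\ 1 \end{pmatrix}
  = \begin{pmatrix} 5 \\ 5 \end{pmatrix}
  = 5 \begin{pmatrix} 1 \\ 1 \end{pmatrix},
\qquad
A \begin{pmatrix} 2 \\ -1 \end{pmatrix}
  = \begin{pmatrix} -2 \\ 1 \end{pmatrix}
  = (-1) \begin{pmatrix} 2 \\ -1 \end{pmatrix}.
\]
```

Since the two eigenvalues are distinct, Theorem 8.2 shows that these two eigenvectors are linearly independent, so they form a basis of $K^2$ which diagonalizes $A$ (for instance over $K = \mathbf{Q}$).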
Theorem 8.5. Let $A$, $B$ be two $n \times n$ matrices, and assume that $B$ is
invertible. Then the characteristic polynomial of $A$ is equal to the
characteristic polynomial of $B^{-1}AB$.

Proof. By definition, and properties of the determinant,

$$\operatorname{Det}(tI - A) = \operatorname{Det}(B^{-1}(tI - A)B) = \operatorname{Det}(tB^{-1}B - B^{-1}AB) = \operatorname{Det}(tI - B^{-1}AB).$$

This proves what we wanted.

Let

$$L: V \to V$$

be a linear map of a finite dimensional vector space into itself, so $L$ is
also called an endomorphism. Select a basis for $V$ and let $A$ be the
matrix associated with $L$ with respect to this basis. We then define the
characteristic polynomial of $L$ to be the characteristic polynomial of $A$. If
we change basis, then $A$ changes to $B^{-1}AB$ where $B$ is invertible. By
Theorem 8.5 this implies that the characteristic polynomial does not
depend on the choice of basis.

Theorem 8.4 can be interpreted for L as stating: 

Let K be algebraically closed. 

Let V be a finite dimensional vector space over K of dimension > 0. 

Let L: V -> V be an endomorphism. Then L has a non-zero eigenvector 

and an eigenvalue in K. 


V, §8. EXERCISES 

1. Let $V$ be an $n$-dimensional vector space and assume that the characteristic
polynomial of a linear map $A: V \to V$ has $n$ distinct roots. Show that $V$ has a
basis consisting of eigenvectors of $A$.

2. Let $A$ be an invertible matrix. If $\lambda$ is an eigenvalue of $A$ show that $\lambda \neq 0$ and
that $\lambda^{-1}$ is an eigenvalue of $A^{-1}$.

3. Let $V$ be the space generated over $\mathbf{R}$ by the two functions $\sin t$ and $\cos t$.
Does the derivative (viewed as a linear map of $V$ into itself) have any nonzero
eigenvectors in $V$? If so, which?

4. Let $D$ denote the derivative which we view as a linear map on the space of
differentiable functions. Let $k$ be an integer $\neq 0$. Show that the functions
$\sin kx$ and $\cos kx$ are eigenvectors for $D^2$. What are the eigenvalues?

5. Let $A: V \to V$ be a linear map of $V$ into itself, and let $\{v_1, \ldots, v_n\}$ be a basis of
$V$ consisting of eigenvectors having distinct eigenvalues $c_1, \ldots, c_n$. Show that
any eigenvector $v$ of $A$ in $V$ is a scalar multiple of some $v_i$.

6. Let $A$, $B$ be square matrices of the same size. Show that the eigenvalues of $AB$
are the same as the eigenvalues of $BA$.

7. (Artin's theorem.) Let $G$ be a group, and let $f_1, \ldots, f_n: G \to K^*$ be distinct
homomorphisms of $G$ into the multiplicative group of a field. In particular,
$f_1, \ldots, f_n$ are functions of $G$ into $K$. Prove that these functions are linearly
independent over $K$. [Hint: Use induction in a way similar to the proof of
Theorem 8.2.]

V, §9. POLYNOMIALS OF MATRICES AND LINEAR MAPS 

Let $n$ be a positive integer. Let $\mathrm{Mat}_n(K)$ denote the set of all $n \times n$
matrices with coefficients in a field $K$. Then $\mathrm{Mat}_n(K)$ is a ring, which is a
finite dimensional vector space over $K$, of dimension $n^2$. Let $A \in \mathrm{Mat}_n(K)$.
Then $A$ generates a subring, which is commutative because powers of $A$
commute with each other. Let $K[t]$ denote the polynomial ring over $K$.
As a special case of the evaluation map, if $f(t) \in K[t]$ is a polynomial, we
can evaluate $f$ at $A$. Indeed:

If $f(t) = a_n t^n + \cdots + a_0$ then $f(A) = a_n A^n + \cdots + a_0 I$.

We know that the evaluation map is a ring homomorphism, so we have
the rules:

Let $f, g \in K[t]$ and $c \in K$. Then:

$$(f + g)(A) = f(A) + g(A),$$

$$(fg)(A) = f(A)g(A),$$

$$(cf)(A) = cf(A).$$
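The evaluation rules above can be tried out on a small case. The following is a minimal sketch under stated assumptions: polynomials are represented as coefficient lists `[a_0, a_1, ..., a_n]`, and the helpers `mat_mul`, `poly_eval` and `poly_mul` as well as the sample matrix and polynomials are hypothetical choices made here for illustration. It evaluates a polynomial at a matrix by Horner's scheme and checks the rule $(fg)(A) = f(A)g(A)$.

```python
def mat_mul(x, y):
    """Product of two square matrices of the same size."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def poly_eval(coeffs, a):
    """Evaluate a_0 + a_1 t + ... + a_n t^n at the matrix a (Horner's scheme).

    coeffs lists the coefficients from a_0 up to a_n; the constant term
    a_0 contributes a_0 * I, as in the definition of f(A).
    """
    n = len(a)
    result = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = mat_mul(result, a)
        result = [[result[i][j] + (c if i == j else 0) for j in range(n)]
                  for i in range(n)]
    return result


def poly_mul(f, g):
    """Coefficient list of the product polynomial fg."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out


A = [[1, 2],
     [3, 4]]
f = [1, 0, 2]   # f(t) = 1 + 2 t^2
g = [-3, 1]     # g(t) = t - 3

lhs = poly_eval(poly_mul(f, g), A)              # (fg)(A)
rhs = mat_mul(poly_eval(f, A), poly_eval(g, A))  # f(A) g(A)
print(lhs == rhs)
```

The same check with the factors in the other order, `mat_mul(poly_eval(g, A), poly_eval(f, A))`, gives the same matrix, reflecting the fact that the subring generated by $A$ is commutative.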

Example. Let $\alpha_1, \ldots, \alpha_n$ be elements of $K$. Let

$$f(t) = (t - \alpha_1) \cdots (t - \alpha_n).$$

Then

$$f(A) = (A - \alpha_1 I) \cdots (A - \alpha_n I).$$

Let $V$ be a vector space over $K$, and let $A: V \to V$ be an
endomorphism (i.e. linear map of $V$ into itself). Then we can form
$A^2 = A \circ A = AA$, and in general $A^n$ = iteration of $A$ taken $n$ times for
any positive integer $n$. We define $A^0 = I$ (where $I$ now denotes the
identity mapping). We have

$$A^{m+n} = A^m A^n$$

for all integers $m, n \geq 0$. If $f$ is a polynomial in $K[t]$ then we can form
$f(A)$ the same way that we did for matrices. The same rules are satisfied,
expressing the fact that $f \mapsto f(A)$ is a ring homomorphism. The image of
$K[t]$ in $\mathrm{End}_K(V)$ under this homomorphism is the commutative subring
denoted by $K[A]$.

Theorem 9.1. Let $A$ be an $n \times n$ matrix with coefficients in a field $K$, or let
$A: V \to V$ be an endomorphism of a vector space $V$ of dimension $n$. Then
there exists a non-zero polynomial $f \in K[t]$ such that $f(A) = O$.

Proof. The vector space of $n \times n$ matrices over $K$ is finite dimensional,
of dimension $n^2$. Hence the powers

$$I, A, A^2, \ldots, A^N$$

are linearly dependent for $N > n^2$. This means that there exist numbers
$a_0, \ldots, a_N \in K$ such that not all $a_i = 0$, and

$$a_N A^N + \cdots + a_0 I = O.$$

We let $f(t) = a_N t^N + \cdots + a_0$ to get what we want. The same proof
applies when $A$ is an endomorphism of $V$.

If we divide the polynomial $f$ of Theorem 9.1 by its leading coefficient,
then we obtain a polynomial $g$ with leading coefficient 1 such that
$g(A) = O$. It is usually convenient to deal with polynomials whose
leading coefficient is 1, since it simplifies the notation.

The kernel of the evaluation map $f \mapsto f(A)$ is an ideal in $K[t]$, which
is principal, and so generated by a unique monic polynomial which is
called the minimal polynomial of $A$ in $K[t]$. Since the ring generated by
$A$ over $K$ may have divisors of zero, it is possible that this minimal
polynomial is not irreducible. This is the first basic distinction which we
encounter from the case when we evaluated polynomials in a field. We
shall prove at the end of the section that if $P_A$ is the characteristic
polynomial, then $P_A(A) = O$. Therefore $P_A(t)$ is in the kernel of the map
$f \mapsto f(A)$, and so the minimal polynomial divides the characteristic
polynomial since the kernel is a principal ideal.

Let $V$ be a vector space over $K$, and let $A: V \to V$ be an endomorphism.
Then we may view $V$ as a module over the polynomial ring $K[t]$
as follows. If $v \in V$ and $f(t) \in K[t]$, then $f(A): V \to V$ is also an
endomorphism of $V$, and we define

$$f(t)v = f(A)v.$$

The properties needed to check that $V$ is indeed a module over $K[t]$ are
trivially verified. The big advantage of dealing with $V$ as a module over
$K[t]$ rather than over $K[A]$ for instance, is that $K[t]$ is a principal ring,
and we can apply the results of §7.

Warning. The structure of $K[t]$-module depends of course on the
choice of $A$. If we selected another endomorphism to define the
operation of $K[t]$ on $V$, then this structure would also change. Thus we
denote by $V_A$ the module $V$ over $K[t]$ determined by the endomorphism
$A$ as above. Theorem 9.1 can now be interpreted as stating:

The module $V_A$ over $K[t]$ is a torsion module.

Therefore Theorems 7.1 and 7.2 apply to give us a description of $V_A$ as a
$K[t]$-module. We shall make this description more explicit.

Let $W$ be a subspace of $V$. We shall say that $W$ is an invariant
subspace under $A$, or is $A$-invariant, if $Aw$ lies in $W$ for all $w \in W$, i.e. if
$AW$ is contained in $W$. It follows directly from the definitions that:

A subspace $W$ is $A$-invariant if and only if $W$ is a $K[t]$-submodule.

Example 1. Let $v_1$ be a non-zero eigenvector of $A$, and let $V_1$ be the
1-dimensional space generated by $v_1$. Then $V_1$ is an invariant subspace
under $A$.

Example 2. Let $\lambda$ be an eigenvalue of $A$, and let $V_\lambda$ be the subspace of
$V$ consisting of all $v \in V$ such that $Av = \lambda v$. Then $V_\lambda$ is an invariant
subspace under $A$, called the eigenspace of $\lambda$.

Example 3. Let $f(t) \in K[t]$ be a polynomial, and let $W$ be the kernel
of $f(A)$. Then $W$ is an invariant subspace under $A$.

Proof. Suppose that $f(A)w = 0$. Since $tf(t) = f(t)t$, we get

$$Af(A) = f(A)A,$$

whence

$$f(A)(Aw) = f(A)Aw = Af(A)w = 0.$$

Thus $Aw$ is also in the kernel of $f(A)$, thereby proving our assertion.

Translating Theorem 7.1 into the present situation yields: 

Theorem 9.2. Let $f(t) \in K[t]$ be a polynomial, and suppose that
$f = f_1 f_2$, where $f_1, f_2$ are polynomials of degree $\geq 1$, and relatively
prime. Let $A: V \to V$ be an endomorphism. Assume that $f(A) = O$. Let

$$W_1 = \text{kernel of } f_1(A) \quad\text{and}\quad W_2 = \text{kernel of } f_2(A).$$

Then $V$ is the direct sum of $W_1$ and $W_2$. In particular, suppose that $f(t)$
has a factorization

$$f(t) = (t - \alpha_1)^{r_1} \cdots (t - \alpha_m)^{r_m}$$

with distinct roots $\alpha_1, \ldots, \alpha_m \in K$. Let $W_i$ be the kernel of $(A - \alpha_i I)^{r_i}$.
Then $V$ is the direct sum of the subspaces $W_1, \ldots, W_m$.

Remark. If the field $K$ is algebraically closed, then we can always
factor the polynomial $f(t)$ into factors of degree 1 as above and the
different powers $(t - \alpha_1)^{r_1}, \ldots, (t - \alpha_m)^{r_m}$ are relatively prime. This is of
course the case over the complex numbers.

Example 4 (Differential equations). Let $V$ be the space of (infinitely
differentiable) solutions of the differential equation

$$D^n f + a_{n-1} D^{n-1} f + \cdots + a_0 f = 0$$

with constant complex coefficients $a_i$. We shall determine a basis of $V$.

Theorem 9.3. Let

$$P(t) = t^n + a_{n-1} t^{n-1} + \cdots + a_0.$$

Factor $P(t)$ as in Theorem 9.2

$$P(t) = (t - \alpha_1)^{r_1} \cdots (t - \alpha_m)^{r_m}.$$

Then $V$ is the direct sum of the spaces of solutions of the differential
equations

$$(D - \alpha_i I)^{r_i} f = 0,$$

for $i = 1, \ldots, m$.

Proof. This is merely a direct application of Theorem 9.2 to the
endomorphism $D$ of $V$, which satisfies $P(D) = O$ on $V$.

Thus the study of the original differential equation is reduced to the
study of the much simpler equation

$$(D - \alpha I)^r f = 0.$$

The solutions of this equation are easily found.
Theorem 9.4. Let $\alpha$ be a complex number. Let $W$ be the space of
solutions of the differential equation

$$(D - \alpha I)^r f = 0.$$

Then $W$ is the space generated by the functions

$$e^{\alpha t},\ te^{\alpha t},\ \ldots,\ t^{r-1}e^{\alpha t}$$

and these functions form a basis for this space, which therefore has
dimension $r$.

Proof. For any complex $\alpha$ we have

$$(D - \alpha I)^r f = e^{\alpha t} D^r(e^{-\alpha t} f).$$

(The proof is a simple induction.) Consequently, $f$ lies in the kernel of
$(D - \alpha I)^r$ if and only if

$$D^r(e^{-\alpha t} f) = 0.$$

The only functions whose $r$-th derivative is 0 are the polynomials of
degree $\leq r - 1$. Hence the space of solutions of $(D - \alpha I)^r f = 0$ is the
space generated by the functions

$$e^{\alpha t},\ te^{\alpha t},\ \ldots,\ t^{r-1}e^{\alpha t}.$$

Finally these functions are linearly independent. Suppose we have a
linear relation

$$c_0 e^{\alpha t} + c_1 te^{\alpha t} + \cdots + c_{r-1} t^{r-1} e^{\alpha t} = 0$$

for all $t$, with constants $c_0, \ldots, c_{r-1}$. Let

$$Q(t) = c_0 + c_1 t + \cdots + c_{r-1} t^{r-1}.$$

Then $Q(t)$ is a polynomial, and we have

$$Q(t)e^{\alpha t} = 0 \quad\text{for all } t.$$

But $e^{\alpha t} \neq 0$ for all $t$ so $Q(t) = 0$ for all $t$. Since $Q$ is a polynomial, we
must have $c_i = 0$ for $i = 0, \ldots, r - 1$ thus concluding the proof.
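The induction mentioned in the proof of Theorem 9.4 is not written out in the text. One way to carry it out, offered here as a sketch that uses only the product rule, is the following: the case $r = 1$ is a direct computation, and the inductive step applies that case to the function $(D - \alpha I)^{r-1} f$.

```latex
% Case r = 1: the product rule gives
\[
D\bigl(e^{-\alpha t} f\bigr) = -\alpha e^{-\alpha t} f + e^{-\alpha t} Df
  = e^{-\alpha t} (D - \alpha I) f,
\qquad\text{so}\qquad
(D - \alpha I) f = e^{\alpha t} D\bigl(e^{-\alpha t} f\bigr).
\]
% Inductive step: assume (D - \alpha I)^{r-1} f = e^{\alpha t} D^{r-1}(e^{-\alpha t} f).
% Apply the case r = 1 to the function g = (D - \alpha I)^{r-1} f:
\[
(D - \alpha I)^{r} f = (D - \alpha I) g
  = e^{\alpha t} D\bigl(e^{-\alpha t} g\bigr)
  = e^{\alpha t} D\Bigl(e^{-\alpha t} e^{\alpha t} D^{r-1}\bigl(e^{-\alpha t} f\bigr)\Bigr)
  = e^{\alpha t} D^{r}\bigl(e^{-\alpha t} f\bigr).
\]
```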
We end this chapter by looking at the meaning of Theorem 7.2 for a
finite dimensional vector space $V$ over an algebraically closed field $K$.
Let

$$A: V \to V$$

be an endomorphism as before. We first make explicit the cyclic case.

Lemma 9.5. Let $v \in V$, $v \neq 0$. Suppose there exists $\alpha \in K$ such that
$(A - \alpha I)^r v = 0$ for some positive integer $r$, and that $r$ is the smallest such
positive integer. Then the elements

$$v,\ (A - \alpha I)v,\ \ldots,\ (A - \alpha I)^{r-1}v$$

are linearly independent over $K$.

Proof. Let $B = A - \alpha I$ for simplicity. A relation of linear dependence
between the above elements can be written

$$f(B)v = 0,$$

where $f$ is a polynomial $\neq 0$ of degree $\leq r - 1$, namely

$$c_0 v + c_1 Bv + \cdots + c_s B^s v = 0,$$

with $f(t) = c_0 + c_1 t + \cdots + c_s t^s$, and $s \leq r - 1$. We also have $B^r v = 0$ by
hypothesis. Let $g(t) = t^r$. If $h$ is the greatest common divisor of $f$ and $g$,
then we can write

$$h = f_1 f + g_1 g,$$

where $f_1, g_1$ are polynomials, and thus $h(B) = f_1(B)f(B) + g_1(B)g(B)$. It
follows that $h(B)v = 0$. But $h(t)$ divides $t^r$ and is of degree $\leq r - 1$, so
that $h(t) = t^d$ with $d < r$. This contradicts the hypothesis that $r$ is
smallest, and proves the lemma.

The module $V_A$ is cyclic over $K[t]$ if and only if there exists an
element $v \in V$, $v \neq 0$ such that every element of $V$ is of the form $f(A)v$ for
some polynomial $f(t) \in K[t]$. Suppose that $V_A$ is cyclic, and in addition
that there is some element $\alpha \in K$ and a positive integer $r$ such that
$(A - \alpha I)^r v = 0$. Also let $r$ be the smallest such positive integer. Then the
minimal polynomial of $A$ on $V$ is precisely $(t - \alpha)^r$. Then Lemma 9.5
implies that

$$(*) \qquad \{(A - \alpha I)^{r-1}v,\ \ldots,\ (A - \alpha I)v,\ v\}$$

is a basis for $V$ over $K$. With respect to this basis, the matrix of $A$ is
then particularly simple. Indeed, for each $k$ we have

$$A(A - \alpha I)^k v = (A - \alpha I)^{k+1} v + \alpha (A - \alpha I)^k v.$$

By definition, it follows that the associated matrix for $A$ with respect to
this basis is equal to the triangular matrix

$$\begin{pmatrix} \alpha & 1 & 0 & \cdots & 0 & 0 \\ 0 & \alpha & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & \alpha & 1 \\ 0 & 0 & 0 & \cdots & 0 & \alpha \end{pmatrix}.$$

This matrix has $\alpha$ on the diagonal, 1 above the diagonal, and 0
everywhere else. The reader will observe that $(A - \alpha I)^{r-1}v$ is an
eigenvector for $A$, with eigenvalue $\alpha$.

The basis (*) is called a Jordan basis for $V$ with respect to $A$. Thus
over an algebraically closed field, we have found a basis for a cyclic
vector space as above such that the matrix of $A$ with respect to this basis
is particularly simple, and is almost diagonal. If $r = 1$, that is if $Av = \alpha v$,
then the matrix is a $1 \times 1$ matrix, which is diagonal.
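The properties of the triangular matrix described above can be checked on a concrete case. The sketch below is an illustration only: the helper names `jordan_block`, `mat_mul` and `mat_pow` and the sample values $\alpha = 5$, $r = 4$ are assumptions made here. It builds the matrix with $\alpha$ on the diagonal and 1 just above it, and verifies that $(J - \alpha I)^r = O$ while $(J - \alpha I)^{r-1} \neq O$, so that $r$ is indeed the smallest exponent, and that the first basis vector is an eigenvector with eigenvalue $\alpha$.

```python
def mat_mul(x, y):
    """Product of two square matrices of the same size."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(x, k):
    """k-th power of a square matrix, with x^0 equal to the identity."""
    n = len(x)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = mat_mul(result, x)
    return result


def jordan_block(alpha, r):
    """r x r matrix with alpha on the diagonal, 1 just above it, 0 elsewhere."""
    return [[alpha if j == i else 1 if j == i + 1 else 0 for j in range(r)]
            for i in range(r)]


alpha, r = 5, 4          # sample values chosen for illustration
J = jordan_block(alpha, r)

# N = J - alpha * I has 1's just above the diagonal and 0 elsewhere.
N = [[J[i][j] - (alpha if i == j else 0) for j in range(r)] for i in range(r)]
zero = [[0] * r for _ in range(r)]

nilpotent_at_r = mat_pow(N, r) == zero       # (J - alpha I)^r = O
not_before_r = mat_pow(N, r - 1) != zero     # (J - alpha I)^(r-1) is not O

# The first column of J gives the image of the first basis vector.
first_column = [J[i][0] for i in range(r)]
print(nilpotent_at_r, not_before_r, first_column)
```

The first column being $(\alpha, 0, \ldots, 0)$ is the matrix form of the observation that the first element of the Jordan basis is an eigenvector with eigenvalue $\alpha$.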

We now turn to the general case. We can reformulate Theorem 7.2 as 
follows. 


Theorem 9.6. Let $V$ be a finite dimensional space over the algebraically
closed field $K$, and $V \neq \{0\}$. Let $A: V \to V$ be an endomorphism. Then
$V$ is a direct sum of $A$-invariant subspaces

$$V = V_1 \oplus \cdots \oplus V_q$$

such that each $V_i$ is cyclic, generated over $K[t]$ by an element $v_i \neq 0$,
and the kernel of the map

$$f(t) \mapsto f(A)v_i$$

is generated by a power $(t - \alpha_i)^{r_i}$ for some positive integer $r_i$ and $\alpha_i \in K$.

If we select a Jordan basis for each $V_i$, then the sequence of these
bases forms a basis for $V$, again called a Jordan basis for $V$ with respect
to $A$. With respect to this basis, the matrix for $A$ therefore splits into
blocks (Fig. 1).

Figure 1

In each block we have an eigenvalue $\alpha_i$ on the diagonal. We have 1
above the diagonal, and 0 everywhere else. This matrix is called the
Jordan normal form for $A$.

It is to be understood that if $r_i = 1$, then there is no 1 above the
diagonal, and the eigenvalue $\alpha_i$ is simply repeated, the number of times
being the dimension of the corresponding eigenspace.

The Jordan normal form also allows us to prove the Cayley-Hamilton 
theorem as a corollary, namely: 

Theorem 9.7. Let $A$ be an $n \times n$ matrix over a field $K$, and let $P_A(t)$ be
its characteristic polynomial. Then $P_A(A) = O$.

Proof. We assume for this proof that $K$ is contained in some
algebraically closed field, and it will then suffice to prove the theorem
under the assumption that $K$ is algebraically closed. Then $A$ represents
an endomorphism of $K^n$, which we take as $V$. We denote the
endomorphism by the same letter $A$. We decompose $V$ into a direct sum as in
Theorem 9.6. Then the characteristic polynomial of $A$ is given by

$$P_A(t) = \prod_{i=1}^{q} (t - \alpha_i)^{r_i}$$

where $r_i = \dim_K(V_i)$. But $P_A(A) = \prod_{i=1}^{q} (A - \alpha_i I)^{r_i}$ and by Theorem 9.6,

$$(A - \alpha_i I)^{r_i} v_i = 0.$$

Hence $P_A(A)v_i = 0$ for all $i$. But $V$ is generated by the elements $f(A)v_i$
for all $i$ with $f \in K[t]$, and

$$P_A(A)f(A)v_i = f(A)P_A(A)v_i = 0.$$

Hence $P_A(A)v = 0$ for all $v \in V$, whence $P_A(A) = O$, as was to be proved.
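The Cayley-Hamilton theorem can be confirmed by direct computation on the matrix of Example 4 in §8, whose characteristic polynomial was found to be $t^3 - t^2 - 4t + 6$. The following sketch (the helper name `mat_mul` and the variable names are chosen here for illustration) computes $A^3 - A^2 - 4A + 6I$ entry by entry and checks that it is the zero matrix.

```python
def mat_mul(x, y):
    """Product of two square matrices of the same size."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# The matrix of Example 4 in Section 8.
A = [[1, -1, 3],
     [-2, 1, 1],
     [0, 1, -1]]
n = len(A)
I = [[1 if i == j else 0 for j in range(n)] for i in range(n)]

A2 = mat_mul(A, A)
A3 = mat_mul(A2, A)

# P_A(A) = A^3 - A^2 - 4A + 6I, computed entry by entry.
P_of_A = [[A3[i][j] - A2[i][j] - 4 * A[i][j] + 6 * I[i][j] for j in range(n)]
          for i in range(n)]

is_zero = all(entry == 0 for row in P_of_A for entry in row)
print(is_zero)
```

Note that the constant term 6 of the polynomial must be multiplied by the identity matrix before it is added, in accordance with the definition of $f(A)$.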


V, §9. EXERCISES 

1. Let $M$ be an $n \times n$ diagonal matrix with eigenvalues $\lambda_1, \ldots, \lambda_r$. Suppose that
$\lambda_i$ has multiplicity $m_i$. Write down the minimal polynomial of $M$, and also
write down its characteristic polynomial.

2. In Theorem 9.2 show that image of $f_1(A)$ = kernel of $f_2(A)$.

3. Let $V$ be a finite dimensional vector space, and let $A: V \to V$ be an
endomorphism. Suppose $A^2 = A$. Show that there is a basis of $V$ such that
the matrix of $A$ with respect to this basis is diagonal, with only 0 or 1 on the
diagonal. Or, if you prefer, show that $V = V_0 \oplus V_1$ is a direct sum, where
$V_0 = \operatorname{Ker} A$ and $V_1$ is the $(+1)$-eigenspace of $A$.

4. Let $A: V \to V$ be an endomorphism, and $V$ finite dimensional. Suppose that
$A^3 = A$. Show that $V$ is the direct sum

$$V = V_0 \oplus V_1 \oplus V_{-1},$$

where $V_0 = \operatorname{Ker} A$, $V_1$ is the $(+1)$-eigenspace of $A$, and $V_{-1}$ is the $(-1)$-eigenspace
of $A$.

5. Let $A: V \to V$ be an endomorphism, and $V$ finite dimensional. Suppose that
the characteristic polynomial of $A$ has the factorization

$$P_A(t) = (t - \alpha_1) \cdots (t - \alpha_n),$$

where $\alpha_1, \ldots, \alpha_n$ are distinct elements of the field $K$. Show that $V$ has a basis
consisting of eigenvectors for $A$.

For the rest of the exercises, we suppose that $V \neq \{0\}$, and that $V$ is finite
dimensional over the algebraically closed field $K$. We let $A: V \to V$ be an
endomorphism.

6. Prove that $A$ is diagonalizable if and only if the minimal polynomial of $A$ has
all roots of multiplicity 1.

7. Suppose $A$ is diagonalizable. Let $W$ be a subspace of $V$ such that $AW \subset W$.
Prove that the restriction of $A$ to $W$ is diagonalizable as an endomorphism
of $W$.

8. Let $B$ be another endomorphism of $V$ such that $BA = AB$. Prove that $A$, $B$
have a common non-zero eigenvector.

9. Let $B$ be another endomorphism of $V$. Assume that $AB = BA$ and both $A$, $B$
are diagonalizable. Prove that $A$ and $B$ are simultaneously diagonalizable,
that is $V$ has a basis consisting of elements which are eigenvectors for both $A$
and $B$.

10. Show that $A$ can be written in the form $A = D + N$, where $D$ is a
diagonalizable endomorphism, $N$ is nilpotent, and $DN = ND$.

11. Assume that $V_A$ is cyclic, annihilated by $(A - \alpha I)^r$ for some $r > 0$ and $\alpha \in K$.
Prove that the subspace of $V$ generated by eigenvectors of $A$ is
one-dimensional.

12. Prove that $V_A$ is cyclic if and only if the characteristic polynomial $P_A(t)$ is
equal to the minimal polynomial of $A$ in $K[t]$.

13. Assume that $V_A$ is cyclic annihilated by $(A - \alpha I)^r$ for some $r > 0$. Let $f$ be a
polynomial. What are the eigenvalues of $f(A)$ in terms of those of $A$? Same
question when $V$ is not assumed cyclic.

14. Let $P_A$ be the characteristic polynomial of $A$, and write it as a product

$$P_A(t) = \prod_{i=1}^{m} (t - \alpha_i)^{r_i}$$

where $\alpha_1, \ldots, \alpha_m$ are distinct. Let $f$ be a polynomial. Express the characteristic
polynomial $P_{f(A)}$ as a product of factors of degree 1.

15. If $A$ is nilpotent and not $O$, show that $A$ is not diagonalizable.

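16. Suppose that $A$ is nilpotent. Prove that $V$ has a basis such that the matrix of
$A$ with respect to this basis has the form

$$\begin{pmatrix} N_1 & & & 0 \\ & N_2 & & \\ & & \ddots & \\ 0 & & & N_r \end{pmatrix}$$

where $N_i = (0)$ or

$$N_i = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 0 & 0 & 0 & \cdots & 0 \end{pmatrix}.$$

The matrix on the right has components 0 except for 1's just above the
diagonal.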


Invariant subspaces 

Let $S$ be a set of endomorphisms of $V$. Let $W$ be a subspace of $V$. We shall say
that $W$ is an $S$-invariant subspace if $BW \subset W$ for all $B \in S$. We shall say that $V$ is
a simple $S$-space if $V \neq \{0\}$ and if the only $S$-invariant subspaces are $V$ itself and
the zero subspace. Prove:

17. Let $A: V \to V$ be an endomorphism such that $AB = BA$ for all $B \in S$.

(a) The image and kernel of $A$ are $S$-invariant subspaces.

(b) Let $f(t) \in K[t]$. Then $f(A)B = Bf(A)$ for all $B \in S$.

(c) Let $U$, $W$ be $S$-invariant subspaces of $V$. Show that $U + W$ and $U \cap W$
are $S$-invariant subspaces.

18. Assume that $V$ is a simple $S$-space and that $AB = BA$ for all $B \in S$. Prove
that either $A$ is invertible or $A$ is the zero map. Using the fact that $V$ is finite
dimensional and $K$ algebraically closed, prove that there exists $\alpha \in K$ such
that $A = \alpha I$.

19. Let $V$ be a finite dimensional vector space over the field $K$, and let $S$ be the
set of all linear maps of $V$ into itself. Show that $V$ is a simple $S$-space.

20. Let $V = \mathbf{R}^2$, let $S$ consist of the matrix

$$\begin{pmatrix} 1 & a \\ 0 & 1 \end{pmatrix}$$

viewed as linear map of $V$ into itself. Here, $a$ is a fixed non-zero real number.
Determine all $S$-invariant subspaces of $V$.

21. Let $V$ be a vector space over the field $K$, and let $\{v_1, \ldots, v_n\}$ be a basis of $V$.
For each permutation $\sigma$ of $\{1, \ldots, n\}$ let $A_\sigma: V \to V$ be the linear map such
that

$$A_\sigma(v_i) = v_{\sigma(i)}.$$

(a) Show that for any two permutations $\sigma$, $\tau$ we have

$$A_\sigma A_\tau = A_{\sigma\tau},$$

and $A_{\mathrm{id}} = I$.

(b) Show that the subspace generated by $v = v_1 + \cdots + v_n$ is an invariant
subspace for the set $S_n$ consisting of all $A_\sigma$.

(c) Show that the element $v$ of part (b) is an eigenvector of each $A_\sigma$. What is
the eigenvalue of $A_\sigma$ belonging to $v$?

(d) Let $n = 2$, and let $\sigma$ be the permutation which is not the identity. Show
that $v_1 - v_2$ generates a 1-dimensional subspace which is invariant under
$A_\sigma$. Show that $v_1 - v_2$ is an eigenvector of $A_\sigma$. What is the eigenvalue?

22. Let $V$ be a vector space over the field $K$, and let $A: V \to V$ be an
endomorphism. Assume that $A^r = I$ for some integer $r \geq 1$. Let

$$T = I + A + \cdots + A^{r-1}.$$

Let $v_0$ be an element of $V$. Show that the space generated by $Tv_0$ is an
invariant subspace of $A$, and that $Tv_0$ is an eigenvector of $A$. If $Tv_0 \neq 0$,
what is the eigenvalue?

23. Let $(V, A)$ and $(W, B)$ be pairs consisting of a vector space and
endomorphism, over the same field $K$. We define a morphism

$$f: (V, A) \to (W, B)$$

to be a homomorphism $f: V \to W$ of $K$-vector spaces satisfying in addition
the condition

$$B \circ f = f \circ A.$$

$$\begin{array}{ccc} V & \xrightarrow{\;A\;} & V \\ {\scriptstyle f}\downarrow & & \downarrow{\scriptstyle f} \\ W & \xrightarrow{\;B\;} & W \end{array}$$

In other words, for all $v \in V$ we have $B(f(v)) = f(A(v))$. An isomorphism of
pairs is a morphism which has an inverse.

Prove that $(V, A)$ is isomorphic to $(W, B)$ if and only if $V_A$ and $W_B$ are
isomorphic as $K[t]$-modules. (The operation of $K[t]$ on $V_A$ is the one
determined by $A$, and the operation of $K[t]$ on $W_B$ is the one determined
by $B$.)

A direct sum decomposition of matrices 

24. Let $F$ be a field and $\mathrm{Mat}_n(F) = M_n$ the ring of $n \times n$ matrices over $F$. Let $E_{ij}$ for
$i, j = 1, \ldots, n$ be the matrix with $(ij)$-component 1, and all other components 0.
Then the set of elements $E_{ij}$ is a basis for $M_n$. Let $D^* = D^*(F)$ be the
multiplicative group of diagonal matrices with non-zero diagonal components. We write
such matrices as $\operatorname{diag}(a_1, \ldots, a_n) = a$. We define the conjugation action of $D^*$ on
$M_n$ by

$$c(a)X = aXa^{-1}.$$

(a) Show that $a \mapsto c(a)$ is a homomorphism from $D^*$ into the group of linear
automorphisms of $M_n$.

(b) Show that each $E_{ij}$ is an eigenvector for the action of $c(a)$, the eigenvalue
being given by $\chi_{ij}(a) = a_i/a_j$.

Thus $M_n$ is a direct sum of eigenspaces. Each $\chi_{ij}: D^* \to F^*$ is a character, i.e.
a homomorphism of $D^*$ into the multiplicative group of $F$. A general context
will be given in Chapter VI, §6.

25. For two matrices $X, Y \in M_n(F)$, define $[X, Y] = XY - YX$. Let $L_X: M_n \to M_n$
denote the map such that $L_X(Y) = [X, Y]$. One calls $L_X$ the bracket (or Lie)
action of $X$ on $M_n$, and $[X, Y]$ the Lie product of $X$ and $Y$.

(a) Show that for each $X$, the map $L_X: Y \mapsto [X, Y]$ is a linear map, satisfying
the Leibniz rule for derivations, that is

$$[X, [Y, Z]] = [[X, Y], Z] + [Y, [X, Z]].$$

(b) Let $D_n$ be the vector space of diagonal matrices. For each $H \in D_n$, show that
$E_{ij}$ is an eigenvector of $L_H$, with eigenvalue $\alpha_{ij}(H) = h_i - h_j$ (where $h_1, \ldots, h_n$ are
the diagonal components of $H$). Show that $\alpha_{ij}: D_n \to F$ is linear. It is called an
eigencharacter of the bracket or Lie action.

(c) For two linear maps $A$, $B$ of a vector space, define $[A, B] = AB - BA$. Show
that $L_{[X, Y]} = [L_X, L_Y]$, so $L$ is also a homomorphism for the Lie product.


The next two chapters are logically independent. A reader interested first 
in field theory can omit the next chapter. 


CHAPTER VI 


Some Linear Groups 


VI, §1. THE GENERAL LINEAR GROUP 

The purpose of this first section is to make you think of multiplication 
of matrices in the context of group theory, and to work out basic exam- 
ples accordingly in the exercises. Except for the logic involved, this 
section could have been placed as exercises in Chapter I. 

Let $R$ be any ring. We recall that the units of $R$ are those elements
$u \in R$ such that $u$ has an inverse $u^{-1}$ in $R$. By definition, the units in the
ring $M_n(K)$ are the invertible matrices, that is the $n \times n$ matrices $A$ which
have an inverse $A^{-1}$. Such an inverse is an $n \times n$ matrix satisfying

$$AA^{-1} = A^{-1}A = I_n.$$

The set of units in any ring is a group, and therefore the invertible $n \times n$
matrices form a group, which we denote by

$$GL_n(K).$$

This group is called the general linear group over $K$. If you know about
determinants, you know that $A$ is invertible if and only if $\det(A) \neq 0$.
Computing the determinant gives you an effective way of determining
whether a matrix is invertible or not.

VI, §1. EXERCISES 

1. Let $A \in GL_n(K)$ and $C \in K^n$. By the affine map determined by $(A, C)$ we mean
the map

$$f_{A,C}: K^n \to K^n \quad\text{such that}\quad f_{A,C}(X) = AX + C.$$

(a) Show that the set of all affine maps is a group, called the affine group.
We denote the affine group by $G$.

(b) Show that $GL_n(K)$ is a subgroup. Is $GL_n(K)$ a normal subgroup? Proof?

(c) Let $T_C: K^n \to K^n$ be the map $T_C(X) = X + C$. Such a map is called a
translation. Show that the translations form a group, which is thus a
subgroup of the affine group. Is the group of translations a normal
subgroup of the affine group? Proof?

(d) Show that the map $f_{A,C} \mapsto A$ is a homomorphism of $G$ onto $GL_n(K)$.
What is its kernel?

2. Determine the period of the following matrices: 


3. Let $A$, $B$ be any $n \times n$ invertible matrices. Show that the periods of $A$ and
$BAB^{-1}$ are the same.

4. Let $A$ be an $n \times n$ matrix. By an eigenvector $X$ for $A$ we mean an element
$X \in K^n$ such that there exists $c \in K$ satisfying $AX = cX$. If $X \neq 0$ then $c$ is
called an eigenvalue of $A$.

(a) If $X$ is an eigenvector for $A$ with eigenvalue $c$, show that $X$ is also an
eigenvector for every power $A^n$, where $n$ is a positive integer. What is the
eigenvalue of $A^n$ if $A^n X \neq 0$?

(b) Suppose that $A$ has finite (multiplicative) period. Show that an
eigenvalue $c$ is necessarily a root of unity, that is $c^n = 1$ for some positive
integer $n$.

5. Show that the additive group of a field K is isomorphic to the multiplicative
group of matrices of type

$$\begin{pmatrix} 1 & a \\ 0 & 1 \end{pmatrix}$$

with $a \in K$.

6. Let G be the group of matrices

$$\begin{pmatrix} a & b \\ 0 & d \end{pmatrix}$$

with $a, b, d \in K$ and $ad \neq 0$. Show that the map

$$\begin{pmatrix} a & b \\ 0 & d \end{pmatrix} \mapsto (a, d)$$

is a homomorphism of G onto the product $K^* \times K^*$ (where $K^*$ is the
multiplicative group of K). Describe the kernel. We could also view our
homomorphism as being into the group of diagonal matrices

$$\begin{pmatrix} a & 0 \\ 0 & d \end{pmatrix}$$

which is isomorphic to $K^* \times K^*$.

7. A matrix $N \in M_n(K)$ is called nilpotent if there exists a positive integer n such
that $N^n = 0$. If N is nilpotent, show that the matrix I + N is invertible.
[Hint: Think of the geometric series.]

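8. (a) Let $G = G_0$ be the group of 3 x 3 upper triangular matrices in a field K,
consisting of all invertible triangular matrices

$$T = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ 0 & a_{22} & a_{23} \\ 0 & 0 & a_{33} \end{pmatrix},$$

so $a_{11}a_{22}a_{33} \neq 0$. Let $G_1$ be the set of matrices

$$\begin{pmatrix} 1 & a_{12} & a_{13} \\ 0 & 1 & a_{23} \\ 0 & 0 & 1 \end{pmatrix}.$$

Show that $G_1$ is a subgroup of G, and that it is the kernel of the
homomorphism which to each triangular matrix T associates the diagonal
matrix consisting of the diagonal elements of T.

(b) Let $G_2$ be the set of matrices

$$\begin{pmatrix} 1 & 0 & a_{13} \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

Show that $G_2$ is a subgroup of $G_1$.

(c) Generalize the above to the case of n x n matrices.

(d) Show that the map

$$\begin{pmatrix} 1 & a_{12} & a_{13} \\ 0 & 1 & a_{23} \\ 0 & 0 & 1 \end{pmatrix} \mapsto (a_{12}, a_{23})$$

is a homomorphism of the group $G_1$ onto the direct product

$$K \times K = K^2.$$

What is the kernel?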

(e) Show that the group $G_2$ is isomorphic to K.

9. Let V be a vector space of dimension n over a field K. Let $\{V_1, \ldots, V_n\}$ be a
sequence of subspaces such that $\dim V_i = i$ and such that $V_i \subset V_{i+1}$. Let
$A: V \to V$ be a linear map. We say that this sequence of subspaces is a fan
for A if $AV_i \subset V_i$.

(a) Let G be the set of all invertible linear maps of V for which $\{V_1, \ldots, V_n\}$ is
a fan. Show that G is a group.

(b) Let $G_i$ be the subset of G consisting of all linear maps A such that
Av = v for all $v \in V_i$. Show that $G_i$ is a group.

(c) By a fan basis we mean a basis $\{v_1, \ldots, v_n\}$ of V such that $\{v_1, \ldots, v_i\}$ is a
basis for $V_i$. Describe the matrix associated with an element of G with
respect to a fan basis. Also describe the matrix associated with an
element of $G_i$.

10. Let F be a finite field with q elements. What is the order of the group of
diagonal matrices:

(a) $\begin{pmatrix} a & 0 \\ 0 & d \end{pmatrix}$ with $a, d \in F$, $ad \neq 0$

(b) $\begin{pmatrix} a_1 & 0 & \cdots & 0 \\ 0 & a_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & a_n \end{pmatrix}$ with $a_1, \ldots, a_n \in F$, $a_i \neq 0$ for all i

11. Let F be a finite field with q elements. Let G be the group of upper
triangular matrices

(a) $\begin{pmatrix} a & b \\ 0 & d \end{pmatrix}$ with $a, b, d \in F$ and $ad \neq 0$

(b) $\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ 0 & a_{22} & a_{23} \\ 0 & 0 & a_{33} \end{pmatrix}$ with $a_{ij} \in F$ and $a_{11}a_{22}a_{33} \neq 0$.

What is the order of G in each case (a) and (b)?

12. Let F be a finite field with q elements. Show that the order of $GL_2(F)$ is
$q(q^2 - 1)(q - 1)$.

13. Let F be a finite field with q elements. Show that the order of $GL_n(F)$ is

$$(q^n - 1)(q^n - q) \cdots (q^n - q^{n-1}) = q^{n(n-1)/2} \prod_{i=1}^{n} (q^i - 1).$$

[Hint: Let $\{v_1, \ldots, v_n\}$ be a basis of $F^n$. Any element of $GL_n(F)$ viewed as a
linear map of $F^n$ into itself is determined by its effect on this basis (Theorem
1.2 of Chapter V), and thus the order of $GL_n(F)$ is equal to the number of
all possible bases. If $A \in GL_n(F)$, let $Av_i = w_i$. For $w_1$ we can select any of
the $q^n - 1$ non-zero vectors in $F^n$. Suppose inductively that we have already
chosen $w_1, \ldots, w_r$ with r < n. These vectors generate a subspace of dimension
r which has $q^r$ elements. For $w_{r+1}$ we can select any of the $q^n - q^r$ elements
outside of this subspace. The formula drops out.]
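The counting argument in the hint to Exercise 13 can be checked numerically. The following is a minimal sketch and not part of the text: the function names are hypothetical, and the brute-force count assumes n = 2 and F = Z/pZ with p prime.

```python
from itertools import product


def gl_order_formula(n, q):
    """(q^n - 1)(q^n - q)...(q^n - q^(n-1)): the number of ordered bases of F^n."""
    order = 1
    for r in range(n):
        # q^n - q^r choices for the next basis vector, as in the hint
        order *= q**n - q**r
    return order


def gl_order_product_form(n, q):
    """q^(n(n-1)/2) * prod_{i=1}^{n} (q^i - 1), the right-hand side of the formula."""
    order = q ** (n * (n - 1) // 2)
    for i in range(1, n + 1):
        order *= q**i - 1
    return order


def gl2_order_bruteforce(p):
    """Count 2x2 matrices (a, b, c, d) over Z/pZ (p prime) with ad - bc != 0."""
    return sum(
        1
        for a, b, c, d in product(range(p), repeat=4)
        if (a * d - b * c) % p != 0
    )
```

For n = 2 the formula specializes to $q(q^2 - 1)(q - 1)$, the expression of Exercise 12.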




VI, §2. STRUCTURE OF $GL_2(F)$

Let

$$\alpha = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

be a 2 x 2 matrix with components in a field F. We define the determinant

$$\det(\alpha) = ad - bc.$$

You probably have already met determinants, but we won't use any
properties which cannot be proved here directly by easy computations.
In particular, by brute force, you can verify that if $\alpha$, $\beta$ are two 2 x 2
matrices, then

$$\det(\alpha\beta) = \det(\alpha)\det(\beta).$$

Also verify:

A 2 x 2 matrix $\alpha$ is invertible if and only if $\det(\alpha) \neq 0$.


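To do this, simply solve for the inverse matrix:

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} x & y \\ z & w \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$

You will get two systems of two linear equations in two unknowns,
which can be solved precisely when $ad - bc \neq 0$.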
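The two facts just stated (multiplicativity of the determinant, and invertibility exactly when $ad - bc \neq 0$) can be tried out on examples. The sketch below works over the rationals; the helper names are hypothetical, and the closed form used in `inverse2` is the standard solution of the two linear systems, which the text leaves to the reader.

```python
from fractions import Fraction


def det2(m):
    """Determinant ad - bc of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    return a * d - b * c


def mul2(m, n):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def inverse2(m):
    """Inverse of a 2x2 rational matrix, or None when ad - bc = 0."""
    (a, b), (c, d) = m
    delta = Fraction(det2(m))
    if delta == 0:
        return None  # the linear systems have no solution
    # Solving the two 2x2 systems gives (1/delta) * (d, -b; -c, a).
    return ((d / delta, -b / delta), (-c / delta, a / delta))
```

For instance, `inverse2(((1, 2), (2, 4)))` returns `None` because the determinant is 0.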

Let $G = GL_2(F)$ be the group of invertible 2 x 2 matrices in F.

From the above, we see that

$$\det: GL_2(F) \to F^*$$

is a homomorphism. We shall investigate its kernel in the next section.
Here we note that this homomorphism is surjective, because the element
$a \in F^*$ is the image of the matrix

$$\begin{pmatrix} a & 0 \\ 0 & 1 \end{pmatrix}.$$

Recall that for any group G, the center of G is the subgroup Z
consisting of all elements $\gamma \in G$ such that $\gamma\alpha = \alpha\gamma$ for all $\alpha \in G$.


Lemma 2.1. The center of $GL_2(F)$ is the group of scalar matrices

$$\begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix}$$

with $a \in F^*$.




Proof. Each scalar matrix aI commutes with all matrices. Conversely,
if $\alpha$ commutes with all matrices, show that $\alpha$ is a scalar matrix. For
instance, use the commutation with matrices like

$$\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \quad \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$$

and so forth. We leave the details as an exercise.


Let Z be the center of $GL_2(F)$. We define the projective linear group

$$PGL_2(F) = GL_2(F)/Z = G/Z.$$

Thus $PGL_2(F)$ is the factor group of $GL_2(F)$ by its center.


The Bruhat decomposition 


We let the (standard) Borel subgroup B of $GL_2(F)$ be the group of all
matrices

$$\begin{pmatrix} a & b \\ 0 & d \end{pmatrix} \quad\text{with } ad \neq 0.$$


Lemma 2.2. The Borel subgroup B is a maximal subgroup of G. That
is, if H is a subgroup with $B \subset H \subset G$ then H = B or H = G.


The proof of this lemma will depend on an analysis of G as follows.
Let

$$\tau = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.$$

We let $B\tau B$ be the set of all elements $\alpha\tau\beta$ with $\alpha, \beta \in B$.


Lemma 2.3. There is a decomposition

$$G = B \cup B\tau B,$$

and B, $B\tau B$ have no elements in common.


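Proof. Let $\alpha, \beta \in B$. Then a direct computation shows that

$$\alpha\tau\beta = \begin{pmatrix} * & * \\ x & * \end{pmatrix}$$

where $x \neq 0$. Since for all $\alpha \in B$ we have

$$\alpha = \begin{pmatrix} * & * \\ 0 & * \end{pmatrix},$$

it follows that $\alpha\tau\beta$ cannot be an element of B and conversely no element
of B can be in $B\tau B$. Hence $B \cap B\tau B$ is empty. Furthermore, given $\gamma \in G$,
write

$$\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix}.$$

If c = 0 then $\gamma \in B$. If $c \neq 0$ then for some $x \in F$ we get

$$\begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix}\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} 0 & b' \\ c & d \end{pmatrix}.$$

Let $\beta = \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix}$, so $\beta \in B$. Then

$$\tau\beta\gamma = \begin{pmatrix} c & d \\ 0 & -b' \end{pmatrix} \in B.$$

Since $\tau^2 = -I$ we get $\tau^{-1} = -\tau$, and $-I \in B$. Hence

$$\gamma \in \beta^{-1}\tau^{-1}B \subset B\tau B,$$

thus proving that $G = B \cup B\tau B$. This concludes the proof of Lemma 2.3.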
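Lemma 2.3 can be confirmed by exhaustive enumeration over a small field. The sketch below is an illustration under stated assumptions, not the method of the text: it takes F = Z/pZ with p prime, stores a matrix as a tuple (a, b, c, d) in row order, and uses the hypothetical name `bruhat_cells`.

```python
from itertools import product


def bruhat_cells(p):
    """Return (G, B, BtB) for GL_2(Z/pZ), matrices stored as tuples (a, b, c, d)."""

    def mul(x, y):
        a, b, c, d = x
        e, f, g, h = y
        return ((a * e + b * g) % p, (a * f + b * h) % p,
                (c * e + d * g) % p, (c * f + d * h) % p)

    # G: all matrices with non-zero determinant
    group = {m for m in product(range(p), repeat=4)
             if (m[0] * m[3] - m[1] * m[2]) % p != 0}
    # B: the Borel subgroup of upper triangular matrices (c = 0)
    borel = {m for m in group if m[2] == 0}
    # tau = (0 1; -1 0), with -1 reduced mod p
    tau = (0, 1, (-1) % p, 0)
    # the double coset B tau B
    big_cell = {mul(mul(x, tau), y) for x in borel for y in borel}
    return group, borel, big_cell
```

With p = 3 the three sets have 48, 12 and 36 elements, so B and $B\tau B$ are disjoint and together exhaust G.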



Now for the proof of Lemma 2.2 that B is a maximal subgroup of G.
Let $B \subset H \subset G$ and $B \neq H$. By Lemma 2.3 there exists an element $\gamma \in H$
such that

$$\gamma = \alpha\tau\beta, \quad\text{with } \alpha, \beta \in B.$$

Since H contains B, it follows that H also contains

$$B\gamma B = B\tau B.$$

Hence H = G by Lemma 2.3. This concludes the proof of Lemma 2.2.


VI, §2. EXERCISES 

1. Let F be a finite field. What is the order of $PGL_2(F)$?

2. Show that the center of $GL_n(F)$ is the group of scalar non-zero matrices.

3. Let Z be the center of $GL_n(F)$. What is the order of $PGL_n(F)$, which is
defined to be $GL_n(F)/Z$?

4. Let $F = F_2 = Z/2Z$ be the field with 2 elements. The group $GL_2(F)$ is
isomorphic to some group which you already have encountered. Which one?
What about $PGL_2(F)$?

5. Let $F = F_3 = Z/3Z$. The group $PGL_2(F)$ is isomorphic to some group which
you already have encountered. Which one?

6. Let $\zeta$ be a primitive n-th root of unity, for instance $\zeta = e^{2\pi i/n}$ in the complex
numbers. Let G be the subgroup of all 2 x 2 matrices generated by the
matrices

$$w = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \quad\text{and}\quad z = \begin{pmatrix} \zeta & 0 \\ 0 & \zeta^{-1} \end{pmatrix}.$$

Show that G has order 2n. What is $wzw^{-1}$?

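7. Let $\zeta$ be a primitive n-th root of unity where n is an odd integer. Let G be
the subgroup of all 2 x 2 matrices generated by the matrices

$$w = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \quad\text{and}\quad z = \begin{pmatrix} \zeta & 0 \\ 0 & \zeta^{-1} \end{pmatrix}.$$

Show that G has order 4n. What is $wzw^{-1}$?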


VI, §3. $SL_2(F)$

We define $SL_2(F)$ to be the kernel of the determinant map

$$\det: GL_2(F) \to F^*.$$

Thus $SL_2(F)$ consists of the matrices with determinant 1. It is a normal
subgroup of $GL_2(F)$. The S stands for "special" and $SL_2(F)$ is called the
special linear group.

Lemma 3.1. The center of $SL_2(F)$ is $\pm I$.

This is proved just as for the center of $GL_2(F)$, using the commutation
rule with special matrices like

$$\begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix} \quad\text{or}\quad \begin{pmatrix} 1 & 0 \\ y & 1 \end{pmatrix}.$$

We leave the computation to the reader.


We let $PSL_2(F) = SL_2(F)/Z$ where Z is the center of $SL_2(F)$. We call
$PSL_2(F)$ the projective special linear group. The main result of this
section will be that if F has at least four elements, then $PSL_2(F)$ is a simple
group.

Lemma 3.2. For $x, y \in F$ we let

$$u(x) = \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix} \quad\text{and}\quad v(y) = \begin{pmatrix} 1 & 0 \\ y & 1 \end{pmatrix}.$$

Then the set of matrices u(x) and v(y) for $x, y \in F$ generate $SL_2(F)$.

Proof. Multiplication on the left by u(x) adds x times the second row
to the first row. Multiplication by u(x) on the right adds x times the first
column to the second column. And similarly for multiplication with v(y).
Thus multiplication with elements u(x) and v(y) carries out row and
column operations. By such multiplications, a given matrix in $SL_2(F)$
can be brought to diagonal form, that is

$$\begin{pmatrix} a & 0 \\ 0 & d \end{pmatrix}$$

and ad = 1, so $d = a^{-1}$. Let $w(a) = u(a)v(-a^{-1})u(a)$. Then

$$w(a) = \begin{pmatrix} 0 & a \\ -a^{-1} & 0 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} a & 0 \\ 0 & a^{-1} \end{pmatrix} = w(a)w(-1),$$

thus concluding the proof that the elements u(x), v(y) generate $SL_2(F)$.

As for $GL_2(F)$, we let $B_s$ be the Borel subgroup of $SL_2(F)$, that is $B_s$
is the group of matrices

$$\begin{pmatrix} a & b \\ 0 & a^{-1} \end{pmatrix}$$

with $a \in F^*$ and $b \in F$. Or in other words,

$$B_s = B \cap SL_2(F).$$

Lemma 3.3. $SL_2(F) = B_s \cup B_s\tau B_s$, and $B_s$, $B_s\tau B_s$ are disjoint.

Proof. The proof is similar to that of Lemma 2.3 and will be left to
the reader.

Lemma 3.4. The Borel subgroup $B_s$ is a maximal subgroup of $SL_2(F)$.

Proof. Same as Lemma 2.2.

Lemma 3.5. The intersection of all subgroups conjugate to $B_s$, that is
all subgroups $\alpha B_s\alpha^{-1}$ with $\alpha \in SL_2(F)$, is the center of $SL_2(F)$.

Proof. Note that

$$\tau = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$$

is an element of $SL_2(F)$. By a direct computation, you can see that

$$\tau B_s\tau^{-1} = \bar{B}_s$$

is the group of lower triangular matrices

$$\begin{pmatrix} a & 0 \\ c & a^{-1} \end{pmatrix}$$

and therefore that

$B_s \cap \tau B_s\tau^{-1} \subset$ subgroup of matrices in $SL_2(F)$ which are
both upper and lower triangular.

A matrix in $SL_2(F)$ which is both upper and lower triangular has the
form

$$\begin{pmatrix} a & 0 \\ 0 & a^{-1} \end{pmatrix}.$$

If we conjugate such a matrix by $\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ we get

$$\begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} a & 0 \\ 0 & a^{-1} \end{pmatrix}\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} a & a - a^{-1} \\ 0 & a^{-1} \end{pmatrix}.$$

The intersection of all conjugates of $B_s$ is a normal subgroup, so if such a
matrix lies in this intersection then its conjugate must again be diagonal, and
we must have $a - a^{-1} = 0$, so $a = a^{-1}$ and $a^2 = 1$. This implies that the
matrix is $\pm I$, thus proving the lemma.
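Lemma 3.5 can be checked by brute force over a small prime field. This is only a sketch under assumptions that are not in the text: F = Z/pZ with p prime, matrices stored as tuples (a, b, c, d) in row order, and hypothetical function names.

```python
from itertools import product


def sl2(p):
    """All matrices (a, b, c, d) over Z/pZ with ad - bc = 1."""
    return [m for m in product(range(p), repeat=4)
            if (m[0] * m[3] - m[1] * m[2]) % p == 1 % p]


def mul(x, y, p):
    a, b, c, d = x
    e, f, g, h = y
    return ((a * e + b * g) % p, (a * f + b * h) % p,
            (c * e + d * g) % p, (c * f + d * h) % p)


def inv(x, p):
    """Inverse of a determinant-1 matrix: (a b; c d)^-1 = (d -b; -c a)."""
    a, b, c, d = x
    return (d % p, -b % p, -c % p, a % p)


def intersection_of_borel_conjugates(p):
    """Intersect x B_s x^-1 over all x in SL_2(Z/pZ)."""
    group = sl2(p)
    borel = [m for m in group if m[2] == 0]  # upper triangular, determinant 1
    common = set(group)
    for x in group:
        x_inv = inv(x, p)
        common &= {mul(mul(x, b, p), x_inv, p) for b in borel}
    return common
```

For p = 5 the intersection consists of the two matrices I and -I, stored as (1, 0, 0, 1) and (4, 0, 0, 4).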


Let G be a group. By the commutator group $G^c$ we mean the group
generated by all elements of the form

$$\alpha\beta\alpha^{-1}\beta^{-1}$$

with $\alpha, \beta \in G$.

Lemma 3.6. If F has at least four elements, then $SL_2(F)$ is equal to its
own commutator group.

Proof. Let

$$s(a) = \begin{pmatrix} a & 0 \\ 0 & a^{-1} \end{pmatrix} \quad\text{for } a \in F^*.$$

We have the commutator relation

$$s(a)u(b)s(a)^{-1}u(b)^{-1} = u(ba^2 - b) = u(b(a^2 - 1))$$

for all $a \in F^*$ and $b \in F$. Let $G = SL_2(F)$. Let $G^c$ be its commutator
group, and let $B_s^c$ be the commutator group of $B_s$. From the hypothesis
that F has at least four elements, we can find an element $a \neq 0$ in F
such that $a^2 \neq 1$, whence the commutator relation shows that $B_s^c$ is the
group of all matrices

$$\begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix} \quad\text{with } b \in F.$$

Denote this group by U. It follows that $G^c \supset U$, and since $G^c$ is normal
(prove this as an exercise), we get

$$G^c \supset \tau U\tau^{-1} = \bar{U},$$

where $\bar{U}$ is the group of all matrices

$$\begin{pmatrix} 1 & 0 \\ c & 1 \end{pmatrix} \quad\text{with } c \in F.$$

From Lemma 3.2 we conclude that $G^c = G$, thus proving Lemma 3.6.
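The role of the hypothesis "at least four elements" in Lemma 3.6 can be seen by computing commutator groups over small prime fields. The following sketch is an illustration only: it assumes F = Z/pZ with p prime, stores matrices as tuples (a, b, c, d), and uses hypothetical function names.

```python
from itertools import product


def sl2(p):
    """All matrices (a, b, c, d) over Z/pZ with ad - bc = 1."""
    return [m for m in product(range(p), repeat=4)
            if (m[0] * m[3] - m[1] * m[2]) % p == 1 % p]


def mul(x, y, p):
    a, b, c, d = x
    e, f, g, h = y
    return ((a * e + b * g) % p, (a * f + b * h) % p,
            (c * e + d * g) % p, (c * f + d * h) % p)


def inv(x, p):
    """Inverse of a determinant-1 matrix."""
    a, b, c, d = x
    return (d % p, -b % p, -c % p, a % p)


def commutator_subgroup(p):
    """Subgroup of SL_2(Z/pZ) generated by all x y x^-1 y^-1."""
    group = sl2(p)
    gens = {mul(mul(x, y, p), mul(inv(x, p), inv(y, p), p), p)
            for x in group for y in group}
    # In a finite group, closing under products of generators gives the subgroup.
    identity = (1, 0, 0, 1)
    subgroup = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = mul(x, g, p)
            if y not in subgroup:
                subgroup.add(y)
                frontier.append(y)
    return subgroup
```

Over Z/5Z the commutator group is the whole group of order 120, in agreement with the lemma. Over Z/2Z and Z/3Z, where the hypothesis fails, it is a proper subgroup.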


Lemma 3.7. Let $G = SL_2(F)$. Let H be a normal subgroup of G. Then
either $H \subset Z$ (where Z is the center) or $H \supset G^c$.

Proof. Write B instead of $B_s$ for simplicity. By the maximality of B
we must have

$$HB = B \quad\text{or}\quad HB = G.$$

If HB = B then $H \subset B$. Since H is normal, we conclude that H is
contained in every conjugate of B, whence in the center by Lemma 3.5. On
the other hand, suppose that HB = G. Write

$$\tau = h\beta \quad\text{with } h \in H \text{ and } \beta \in B.$$

Then

$$\tau U\tau^{-1} = \bar{U} = h\beta U\beta^{-1}h^{-1} = hUh^{-1} \subset HU$$

because H is normal, so HU = UH. Since $U \subset HU$ and U, $\bar{U}$ generate
G by Lemma 3.2, it follows that HU = G. Let

$$f: G = HU \to G/H = HU/H$$

be the canonical homomorphism. Then f(h) = 1 for all $h \in H$. Since U is
commutative, it follows that f(G) = f(U) and that G/H is a
homomorphic image of the commutative group U, whence G/H is abelian.
This implies that H contains $G^c$, thus proving the lemma.

Theorem 3.8. Let F be a field with at least four elements. Let Z be the
center of $SL_2(F)$. Then $SL_2(F)/Z$ is simple.

Proof. Let $G = SL_2(F)$. Let

$$g: G \to G/Z$$

be the canonical homomorphism. Let $\bar{H}$ be a normal subgroup of G/Z
and let

$$H = g^{-1}(\bar{H}).$$

Then H is a normal subgroup of G which contains the center Z. If H = Z then
$\bar{H}$ is just the unit element of G/Z. If $H \neq Z$ then H = G by Lemma 3.7
and Lemma 3.6 which says that $G^c = G$. Hence $\bar{H} = G/Z$. This
concludes the proof.


VI, §3. EXERCISES 


Throughout Exercises 1, 2, 3, we let $G = SL_2(\mathbf{R})$ where $\mathbf{R}$ is the field of real
numbers.

1. Let H be the upper half plane, that is the set of all complex numbers

$$z = x + iy$$

with y > 0. Let

$$\alpha = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in G.$$

Define

$$\alpha(z) = \frac{az + b}{cz + d}.$$

Prove by explicit computation that $\alpha(z) \in H$ and that:

(a) If $\alpha, \beta \in G$ then $\alpha(\beta(z)) = (\alpha\beta)(z)$.

(b) If $\alpha = \pm I$ then $\alpha(z) = z$.

In other words, we have defined an operation of $SL_2(\mathbf{R})$ on H, according to
the definition of Chapter II, §8.

2. Given a real number $\theta$, let

$$r(\theta) = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}.$$

(a) Show that $\theta \mapsto r(\theta)$ is a homomorphism of $\mathbf{R}$ into G. We denote by K
the set of all such matrices $r(\theta)$. So K is a subgroup of G.

(b) Show that if $\alpha = r(\theta)$ then $\alpha(i) = i$, where $i = \sqrt{-1}$.

(c) Show that if $\alpha \in G$, and $\alpha(i) = i$ then there is some $\theta$ such that $\alpha = r(\theta)$.

In the terminology of the operation of a group, we note that K is the isotropy
group of i in G.

3. Let A be the subgroup of G consisting of all matrices

$$s(a) = \begin{pmatrix} a & 0 \\ 0 & a^{-1} \end{pmatrix} \quad\text{with } a > 0.$$

(a) Show that the map $a \mapsto s(a)$ is a homomorphism of $\mathbf{R}^+$ into G. Since this
homomorphism is obviously injective, this homomorphism gives an
imbedding of $\mathbf{R}^+$ into G.

(b) Let U be the subgroup of G consisting of all elements

$$u(x) = \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix} \quad\text{with } x \in \mathbf{R}.$$

Thus $x \mapsto u(x)$ gives an imbedding of $\mathbf{R}$ into G. Then UA is a subset of G.
Show that UA is a subgroup. How does it differ from the Borel subgroup
of G? Show that U is normal in UA.

(c) Show that the map

$$UA \to H$$

given by

$$\beta \mapsto \beta(i)$$

gives a bijection of UA onto H.

(d) Show that every element of $SL_2(\mathbf{R})$ admits a unique expression as a
product

$$u(x)s(a)r(\theta),$$

so in particular, G = UAK.

4. Let $G = GL_2(F)$ where F = Z/3Z, and let $V = F \times F$ be the vector space of
pairs of elements of F, having dimension 2 over F.

(a) Show that G operates as a permutation group of the subspaces of V of
dimension 1. How many such subspaces are there?

(b) From (a), establish an isomorphism $G/\pm 1 \approx S_4$ (where $S_4$ is the symmetric
group on 4 elements).

(c) Establish an isomorphism $SL_2(Z/3Z)/\pm 1 \approx A_4$, where $A_4$ is the alternating
subgroup of $S_4$.




5. Let F be a finite field of characteristic p. Let U be the subgroup of $GL_n(F)$
consisting of upper triangular matrices whose diagonal elements are all equal to 1.
Prove that U is a p-Sylow subgroup of $GL_n(F)$.

6. Again let F be a finite field of characteristic p. Prove that the p-Sylow subgroups
of $SL_n(F)$ and $GL_n(F)$ are the same.

7. Let R be a principal ring. By $GL_n(R)$ we mean the set of matrices with
components in R such that the determinant is a unit in R. We assume that you
know determinants from a course in linear algebra. Show that $GL_n(R)$ is a
group.

8. Let $(x_{11}, \ldots, x_{1n})$ be an n-tuple of elements in a principal ring R, and assume that
they are relatively prime, that is, the ideal generated by them in R is R itself (the
unit ideal). Show that there exist elements $x_{ij}$ ($i = 2, \ldots, n$; $j = 1, \ldots, n$) such that
the matrix $X = (x_{ij})$ is in $GL_n(R)$. Do this by induction, starting with n = 2.

9. Let $SL_n(R)$ be the subset of $GL_n(R)$ consisting of those matrices with
determinant 1. Show that $SL_n(R)$ is a subgroup.

10. Let F be the quotient field of the principal ring R. Let $B_n(F)$ be the subset of
$GL_n(F)$ consisting of the upper triangular matrices (arbitrary on the diagonal,
but non-zero determinant). Show by induction that

$$GL_n(F) = SL_n(R)B_n(F).$$

[Hint: First do the case n = 2 using Exercise 8. Next let n > 2. Let $X = (x_{ij})$ be
an unknown matrix in $SL_n(R)$, and let $X_n$ be its bottom row. Let $A^1, \ldots, A^n$ be
the columns of a given matrix $A \in GL_n(F)$. We want to solve for $X_n A^i = 0$ for
$i = 1, \ldots, n - 1$, so that XA has its last row equal to 0 except for the lower right
corner. Consider the R-module consisting of R-vectors $X_n$ satisfying these
orthogonality relations. It has a non-zero element, and by unique factorization it has
an element $X_n$ whose components are relatively prime. Use Exercise 8. For some
$A' \in GL_{n-1}(F)$, XA is a matrix with 0 in the bottom row except for the lower
right hand corner, which is a unit in R. Use induction again to get a matrix
$Y \in SL_{n-1}(R)$ such that YA' is upper triangular. Conclude.]


VI, §4. $SL_n(\mathbf{R})$ AND $SL_n(\mathbf{C})$, IWASAWA DECOMPOSITIONS

Let F be a field. By $SL_n(F)$ we mean the group of n x n matrices with
components in F, having determinant 1. We shall give decompositions valid
over the real and complex numbers in terms of special subgroups.

Let $G = SL_n(\mathbf{R})$. Note that the subset consisting of the two elements
I, -I is a subgroup. Also note that $SL_n(\mathbf{R})$ is a subgroup of the group
$GL_n(\mathbf{R})$ (all real matrices with non-zero determinant).

Let:

U = subgroup of upper triangular matrices with 1's on the diagonal,

$$u(X) = \begin{pmatrix} 1 & x_{12} & \cdots & x_{1n} \\ 0 & 1 & \cdots & x_{2n} \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix},$$

called unipotent.

A = subgroup of positive diagonal elements:

$$a = \begin{pmatrix} a_1 & & & \\ & a_2 & & \\ & & \ddots & \\ & & & a_n \end{pmatrix} \quad\text{with } a_i > 0 \text{ for all } i.$$

K = subgroup of real unitary matrices k, satisfying ${}^t k = k^{-1}$.

Theorem 4.1 (Iwasawa decomposition). The product map $U \times A \times K \to G$
given by

$$(u, a, k) \mapsto uak$$

is a bijection.

Proof. Let $e_1, \ldots, e_n$ be the standard unit vectors of $\mathbf{R}^n$ (vertical). Let
$g = (g_{ij}) \in G$. Then we have

$$ge_i = \begin{pmatrix} g_{11} & \cdots & g_{1n} \\ \vdots & & \vdots \\ g_{n1} & \cdots & g_{nn} \end{pmatrix}\begin{pmatrix} 0 \\ \vdots \\ 1 \\ \vdots \\ 0 \end{pmatrix} = \begin{pmatrix} g_{1i} \\ \vdots \\ g_{ni} \end{pmatrix} = g^{(i)} = \sum_{q=1}^{n} g_{qi}e_q.$$

There exists an upper triangular matrix $B = (b_{ij})$, so with $b_{ij} = 0$ if i > j,
such that

$$\begin{aligned}
b_{11}g^{(1)} &= e'_1 \\
b_{12}g^{(1)} + b_{22}g^{(2)} &= e'_2 \\
&\;\;\vdots \\
b_{1j}g^{(1)} + b_{2j}g^{(2)} + \cdots + b_{jj}g^{(j)} &= e'_j \\
&\;\;\vdots \\
b_{1n}g^{(1)} + b_{2n}g^{(2)} + \cdots + b_{nn}g^{(n)} &= e'_n,
\end{aligned}$$

such that the diagonal elements are positive, that is $b_{11}, \ldots, b_{nn} > 0$, and
such that the vectors $e'_1, \ldots, e'_n$ are mutually perpendicular unit vectors.
Getting such a matrix B is merely applying the usual Gram-Schmidt
orthogonalization process, subtracting a linear combination of previous vectors to get
orthogonality, and then dividing by the norms to get unit vectors. Thus

$$e'_j = \sum_{i=1}^{n} b_{ij}g^{(i)} = \sum_{i=1}^{n}\sum_{q=1}^{n} g_{qi}b_{ij}e_q = \sum_{q=1}^{n}\sum_{i=1}^{n} g_{qi}b_{ij}e_q.$$

Let gB = k. Then $ke_j = e'_j$, so k maps the orthogonal unit vectors
$e_1, \ldots, e_n$ to the orthogonal unit vectors $e'_1, \ldots, e'_n$. Therefore k is unitary,
that is $k \in K$, and $g = kB^{-1}$. Then

$$g^{-1} = Bk^{-1} \quad\text{and}\quad B = au$$

where a is the diagonal matrix with $a_i = b_{ii}$ and u is unipotent, $u = a^{-1}B$.
Since $aua^{-1}$ is again unipotent, $g^{-1} = (aua^{-1})ak^{-1}$ lies in UAK, and $g^{-1}$
ranges over all of G. This proves the surjection G = UAK. For uniqueness of the
decomposition, if g = uak = u'a'k', let $u_1 = u^{-1}u'$, so using $g\,{}^t g$ you get

$$a^2\,{}^t u_1^{-1} = u_1 a'^2.$$

These matrices are lower and upper triangular respectively, with diagonals
$a^2$, $a'^2$, so a = a', and finally $u_1 = I$, proving uniqueness.
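The Gram-Schmidt argument can be turned into a short numerical routine. The sketch below is a variant of the proof rather than a transcription of it: instead of orthogonalizing the columns of g and obtaining $g^{-1} = Bk^{-1}$, it orthogonalizes the rows of g from the bottom up, which yields g = uak directly. The function names are hypothetical, and matrices are plain lists of rows of floats.

```python
import math


def mat_mul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def iwasawa(g):
    """Return (u, a, k) with g = u a k for an invertible real matrix g.

    u is upper triangular with 1's on the diagonal, a is diagonal with
    positive entries, and the rows of k are orthonormal.
    """
    n = len(g)
    k = [None] * n
    t = [[0.0] * n for _ in range(n)]  # upper triangular matrix with g = t k
    for i in range(n - 1, -1, -1):     # rows from the bottom up
        v = list(g[i])
        for j in range(i + 1, n):
            # component of row i along the already constructed unit vector k[j]
            coeff = sum(g[i][c] * k[j][c] for c in range(n))
            t[i][j] = coeff
            v = [v[c] - coeff * k[j][c] for c in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        t[i][i] = norm
        k[i] = [x / norm for x in v]
    a = [[t[i][i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    u = [[t[i][j] / t[j][j] for j in range(n)] for i in range(n)]  # u = t a^-1
    return u, a, k
```

Multiplying the three factors back together with `mat_mul` recovers g up to rounding error.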

The elements of U are called unipotent because they are of the form

$$u(X) = I + X,$$

where X is strictly upper triangular, and $X^{n+1} = 0$. Thus X = u - I is called
nilpotent. Let

$$\exp Y = \sum_{j=0}^{\infty} \frac{Y^j}{j!} \quad\text{and}\quad \log(I + X) = \sum_{i=1}^{\infty} (-1)^{i+1}\frac{X^i}{i}.$$

Let $\mathfrak{n}$ denote the space of all strictly upper triangular matrices. Then

$$\exp: \mathfrak{n} \to U, \quad Y \mapsto \exp Y$$

is a bijection, whose inverse is given by the log series, $Y = \log(I + X)$. Note
that, because of the nilpotency, the exp and log series are actually
polynomials, defining inverse polynomial mappings between U and $\mathfrak{n}$. The
bijection actually holds over any field of characteristic 0. The relations

$$\exp\log(I + X) = I + X \quad\text{and}\quad \log\exp Y = \log(I + X) = Y$$

hold as identities of formal power series. Cf. my Complex Analysis, Chapter
II, §3, Exercise 2.
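Because the series terminate, the two maps can be computed exactly. The sketch below works over the rationals (a field of characteristic 0) with hypothetical helper names; it truncates both series after the power n - 1, which is enough since a strictly upper triangular n x n matrix X satisfies $X^n = 0$.

```python
from fractions import Fraction
from math import factorial


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def mat_mul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_add(x, y):
    n = len(x)
    return [[x[i][j] + y[i][j] for j in range(n)] for i in range(n)]


def mat_scale(c, x):
    return [[c * entry for entry in row] for row in x]


def exp_nilpotent(y):
    """exp Y = sum_{j >= 0} Y^j / j! for strictly upper triangular Y."""
    n = len(y)
    total = identity(n)  # the j = 0 term
    power = identity(n)
    for j in range(1, n):
        power = mat_mul(power, y)
        total = mat_add(total, mat_scale(Fraction(1, factorial(j)), power))
    return total


def log_unipotent(u):
    """log(I + X) = sum_{i >= 1} (-1)^(i+1) X^i / i for X = u - I nilpotent."""
    n = len(u)
    x = mat_add(u, mat_scale(Fraction(-1), identity(n)))
    total = [[Fraction(0)] * n for _ in range(n)]
    power = identity(n)
    for i in range(1, n):
        power = mat_mul(power, x)
        total = mat_add(total, mat_scale(Fraction((-1) ** (i + 1), i), power))
    return total
```

Applying `log_unipotent` to `exp_nilpotent(Y)` returns Y exactly, with no rounding, since all arithmetic is done with fractions.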




We now recall Chapter II, §8.

Let X be a set. A bijective map $\sigma: X \to X$ of X with itself is called a
permutation. You can verify at once that the set of permutations of X is a
group, denoted by Perm(X). By an action of a group G on X we mean a
map

$$G \times X \to X \quad\text{denoted by}\quad (g, x) \mapsto gx,$$

satisfying the two properties:

If e is the unit element of G, then ex = x for all $x \in X$.

For all $g_1, g_2 \in G$ and $x \in X$ we have $g_1(g_2x) = (g_1g_2)x$.

This is just a general formulation of action, of which we have seen an
example above. Given $g \in G$, the map $x \mapsto gx$ of X into itself is a permutation of
X. You can verify this directly from the definition, namely the inverse
permutation is given by $x \mapsto g^{-1}x$. Let $\sigma(g)$ denote the permutation associated
with g. Then you can also verify directly from the definition that

$$g \mapsto \sigma(g)$$

is a homomorphism of G into the group of permutations of X. Conversely,
such a homomorphism gives rise to an action of G on X.

Let $x \in X$. By the isotropy group of x (in G) we mean the subset of
elements $g \in G$ such that gx = x. This subset is immediately verified to be a
subgroup, denoted by $G_x$.

Geometric interpretation in dimension 2

Let $\mathbf{H}^2$ be the upper half plane of complex numbers z = x + iy with $x, y \in \mathbf{R}$
and y > 0, y = y(z). For

$$g = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in G = SL_2(\mathbf{R})$$

define

$$g(z) = (az + b)(cz + d)^{-1}.$$

It is routinely verified by computation that this defines an action of G on the
upper half plane. Also note the property

If g(z) = z for all $z \in \mathbf{H}^2$ then $g = \pm I$.

In other words, the kernel of the action homomorphism $G \to \mathrm{Aut}(\mathbf{H}^2)$ is the
subgroup $\{\pm I\}$.

To see that if $z \in \mathbf{H}^2$ then $g(z) \in \mathbf{H}^2$ also, you will need to check the
transformation formula

$$y(g(z)) = \frac{y(z)}{|cz + d|^2}.$$

These statements are proved by easy brute force.

Furthermore, by direct computation, you can verify:

Theorem 4.2. The isotropy group of i is K, i.e., K is the subgroup of
elements $k \in G$ such that k(i) = i. This is the group of matrices of the form

$$\begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}.$$

Or equivalently, a = d, c = -b, $a^2 + b^2 = 1$.

For $x \in \mathbf{R}$ and $a_1 > 0$, let

$$u(x) = \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix} \quad\text{and}\quad a = \begin{pmatrix} a_1 & 0 \\ 0 & a_2 \end{pmatrix} \quad\text{with } a_2 = a_1^{-1}.$$

If g = uak, then u(x)(z) = z + x, so putting $y = a_1^2$, we get a(i) = yi,

$$g(i) = uak(i) = ua(i) = yi + x = x + iy.$$

Thus G acts transitively, and we have a description of the action in terms of
the Iwasawa decomposition and the coordinates of the upper half plane.

Geometric interpretation in dimension 3

We shall use the quaternions, whose elements are linear combinations

$$z = x_1 + x_2\mathbf{i} + x_3\mathbf{j} + x_4\mathbf{k} \quad\text{with } x_1, x_2, x_3, x_4 \in \mathbf{R}$$

and $\mathbf{i}^2 = \mathbf{j}^2 = \mathbf{k}^2 = -1$, $\mathbf{ij} = \mathbf{k}$, $\mathbf{jk} = \mathbf{i}$, $\mathbf{ki} = \mathbf{j}$; also $x \in \mathbf{R}$ commutes with
$\mathbf{i}, \mathbf{j}, \mathbf{k}$. Define

$$\bar{z} = x_1 - x_2\mathbf{i} - x_3\mathbf{j} - x_4\mathbf{k}.$$

Then

$$z\bar{z} = x_1^2 + x_2^2 + x_3^2 + x_4^2,$$

and we define $|z| = (z\bar{z})^{1/2}$.

Let $\mathbf{H}^3$ be the upper half space consisting of elements z whose
$\mathbf{k}$-component is 0, and $x_3 > 0$, so we write

$$z = x_1 + x_2\mathbf{i} + y\mathbf{j} \quad\text{with } y > 0.$$

Let $G = SL_2(\mathbf{C})$, so elements of G are matrices

$$g = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \quad\text{with } a, b, c, d \in \mathbf{C} \text{ and } ad - bc = 1.$$

As in the case of $\mathbf{H}^2$, define

$$g(z) = (az + b)(cz + d)^{-1}.$$

Verify by brute force that if $z \in \mathbf{H}^3$ then $g(z) \in \mathbf{H}^3$, and that G acts on $\mathbf{H}^3$,
namely the two properties listed in the previous example are also satisfied
here. Since the quaternions are not commutative, we have to use the
quotient as written $(az + b)(cz + d)^{-1}$. Also note that the y-coordinate
transformation formula for $z \in \mathbf{H}^3$ reads the same as for $\mathbf{H}^2$, namely

$$y(g(z)) = \frac{y(z)}{|cz + d|^2}.$$

For the Iwasawa decomposition, we use the groups:

U = group of elements $u(x) = \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix}$ with $x \in \mathbf{C}$;

A = same group as before in the case of $SL_2(\mathbf{R})$;

K = complex unitary group of elements k such that ${}^t\bar{k} = k^{-1}$.

The group $G = SL_2(\mathbf{C})$ has the Iwasawa decomposition G = UAK. Each
element of G has a unique decomposition g = uak with $u \in U$, $a \in A$, $k \in K$.

The previous proof works the same way, BUT you can verify directly:

Theorem 4.3. The isotropy group $G_{\mathbf{j}}$ is K.

If g = uak with $u \in U$, $a \in A$, $k \in K$, u = u(x) and y = y(a), then

$$g(\mathbf{j}) = x + y\mathbf{j}.$$

Thus G acts transitively, and the Iwasawa decomposition also follows
trivially from this group action (see below). Thus the orthogonalization-type
proof can be completely avoided. Of course, it can be similarly avoided for
$\mathbf{H}^2$ and $SL_2(\mathbf{R})$, using $x \in \mathbf{R}$.

Proof of the Iwasawa decomposition from the above two properties. Let
$g \in G$ and $g(\mathbf{j}) = x + y\mathbf{j}$. Let u = u(x) and a be such that $y = a_1/a_2 = a_1^2$.
Let g' = ua. Then by the second property, we get $g(\mathbf{j}) = g'(\mathbf{j})$, so
$\mathbf{j} = g^{-1}g'(\mathbf{j})$. By the first property, we get $g^{-1}g' = k$ for some $k \in K$, so

$$g'k^{-1} = uak^{-1} = g,$$

concluding the proof.
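The same argument, carried out for $\mathbf{H}^2$ and $SL_2(\mathbf{R})$ with the point i in place of $\mathbf{j}$, gives a direct recipe for the three factors. The sketch below is an illustration with hypothetical function names: it reads x and y off g(i), sets $a_1 = \sqrt{y}$, and recovers k as $(ua)^{-1}g$. Python's built-in complex numbers stand in for the upper half plane.

```python
import math


def act(g, z):
    """Action g(z) = (az + b)(cz + d)^-1 of a 2x2 real matrix on a complex z."""
    (a, b), (c, d) = g
    return (a * z + b) / (c * z + d)


def mul2(m, n):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def iwasawa_sl2(g):
    """Return (u, a, k) with g = u a k for g in SL_2(R), using the action on i."""
    w = act(g, 1j)            # g(i) = x + iy
    x, y = w.real, w.imag
    a1 = math.sqrt(y)         # y = a1 / a2 = a1^2
    u = ((1.0, x), (0.0, 1.0))
    a = ((a1, 0.0), (0.0, 1.0 / a1))
    # (ua)^-1 = (1/a1, -x/a1; 0, a1), since ua = (a1, x/a1; 0, 1/a1) has determinant 1
    ua_inv = ((1.0 / a1, -x / a1), (0.0, a1))
    k = mul2(ua_inv, g)       # k fixes i, so it is a rotation matrix
    return u, a, k
```

For g with rows (2, 1) and (1, 1) one gets g(i) = (3 + i)/2, so x = 3/2 and y = 1/2.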


We now come to the general case of $SL_n(\mathbf{C})$.

Let $G = SL_n(\mathbf{C})$. Let:

$U = U(\mathbf{C})$ be the subgroup of elements I + Z such that Z is strictly upper
triangular with components in $\mathbf{C}$.

D = subgroup of diagonal matrices.

A = same A as for $SL_n(\mathbf{R})$, namely the subgroup of diagonal matrices
with positive diagonal elements.

K = subgroup of elements k such that ${}^t\bar{k} = k^{-1}$. Then K is also called the
(complex) unitary subgroup. Its elements are called unitary.

For $g \in G$, define

$$g^* = {}^t\bar{g}.$$

You can verify directly and easily that

$$(g_1g_2)^* = g_2^*g_1^*.$$

Note that K is the subgroup of G consisting of those elements k such that

$$k^* = k^{-1}.$$

Let $\langle\cdot,\cdot\rangle$ be the standard hermitian scalar product on $\mathbf{C}^n$, that is: if
$z, w \in \mathbf{C}^n$,

$$z = {}^t(z_1, \ldots, z_n) \quad\text{and}\quad w = {}^t(w_1, \ldots, w_n) \quad\text{with } z_i, w_i \in \mathbf{C},$$

then

$$\langle z, w\rangle = \sum_{i=1}^{n} z_i\bar{w}_i.$$

The element $g^*$ satisfies

$$\langle gz, w\rangle = \langle z, g^*w\rangle \quad\text{for all } z, w \in \mathbf{C}^n,$$

so it is called the adjoint (which you should know from a course in linear
algebra).

We also have the Iwasawa decomposition, with the same A as in the real
case, but complexified U and K, as follows.

Theorem 4.4. Let $G = SL_n(\mathbf{C})$. Then with U, A, K defined as above, the
product map

$$U \times A \times K \to UAK = G$$

gives a bijection with G.

The proof is the same as in the real case, using orthogonalization with
respect to the hermitian product.

VI, §5. OTHER DECOMPOSITIONS 

The star operation is of course defined for all matrices. A complex n x n
matrix Z is called hermitian if and only if $Z = Z^*$.

Let $G = SL_n(\mathbf{C})$. We define the quadratic map on G by

$$g \mapsto gg^*.$$

Then $gg^*$ is hermitian positive definite, i.e., for every $v \in \mathbf{C}^n$, we have
$\langle gg^*v, v\rangle \geq 0$, and = 0 only if v = 0.

We denote by $P = SPos_n(\mathbf{C})$ the set of all hermitian positive definite n x n
matrices with determinant 1.

Theorem 5.1. The set $P = SPos_n(\mathbf{C})$ is the set of elements $kak^{-1}$ with $a \in A$
and $k \in K$, in other words, it is c(K)A (image of A under K-conjugation).

Proof. We use a fact which you should have learned in a course on linear
algebra, that a hermitian matrix can be diagonalized. In fact, if g is
hermitian, then there exists a basis consisting of orthogonal unit vectors such that
each basis vector is an eigenvector of g. An equivalent way of putting this is
that there exists a unitary matrix k such that $kgk^{-1}$ is diagonal. Since p is
assumed positive definite, putting g = p, it follows that the diagonal
elements of $kpk^{-1}$ are real positive. Conversely, let $a \in A$, $k \in K$. Write
$a = b^2$ with $b \in A$. Then for any vector $v \neq 0$,

$$\langle kb^2k^{-1}v, v\rangle = \langle bk^{-1}v, bk^{-1}v\rangle > 0,$$

which proves the positive definiteness, and the theorem.

Theorem 5.2. Let p, q be hermitian and commute. Then there is a basis of
$\mathbf{C}^n$ consisting of vectors which are also eigenvectors for both p and q, i.e.,
p, q can be simultaneously diagonalized.

Proof. Let $E = \mathbf{C}^n$. There exists a basis of E consisting of p-eigenvectors.
For each eigenvalue c of p, let $E_c(p)$ be the c-eigenspace, that is, the
subspace of eigenvectors with eigenvalue c. Then E is the direct sum of the
subspaces $E_c(p)$, taken over the eigenvalues c ($\neq 0$ since p is invertible).
Directly from the commutativity pq = qp, we see that q maps an eigenspace
$E_c(p)$ into itself, namely

if pv = cv then pqv = qpv = qcv = cqv.

Since q is hermitian, there is a basis $\{w_1, \ldots, w_r\}$ of $E_c(p)$ consisting of
eigenvectors of q. Thus p, q can be simultaneously diagonalized, as desired.
By orthogonalization and multiplication by positive scalars, one can of
course achieve that the basis is orthonormal.

Corollary 5.3. Let p be hermitian positive definite. Then p has a unique
square root in P.

Proof. We write $E = \bigoplus E_c(p)$ as a sum of eigenspaces as in the theorem.
Let $q^2 = p$, with $q \in P$. Since q maps $E_c(p)$ into itself, it suffices to prove
that its restriction to $E_c(p)$ is unique. Let $\{v_1, \ldots, v_m\}$ be a basis of $E_c(p)$
consisting of eigenvectors for q, so $pv_i = cv_i$ and $qv_i = b_iv_i$ for all i. By the
positive definiteness, we have c, $b_i$ real > 0. From $q^2 = p$ we get $b_i^2 = c$,
whence $b_i$ is the unique positive square root of c, thus proving the corollary.

Theorem 5.4 (Cartan Decomposition). The quadratic map $g \mapsto gg^*$ induces
a bijection

$$G/K \to SPos_n(\mathbf{C}) = P.$$

In fact, we have a product decomposition G = PK, each $g \in G$ having a
unique product decomposition g = pk with $p \in P$ and $k \in K$.

Proof. Since by Corollary 5.3, every $p \in P$ can be written as $q^2$ with
$q \in P$, the quadratic map is surjective. For injectivity on G/K, suppose that

$$g_1g_1^* = g_2g_2^*.$$

Then $g_1^{-1}g_2g_2^*(g_1^{-1})^* = I$. Hence $g_1^{-1}g_2 \in K$, so $g_2 \in g_1K$, which
proves the injectivity on G/K. By Theorem 5.1, given $g \in G$ there exists
$k \in K$ and $b^2 \in A$ such that

$$gg^* = kb^2k^{-1} = (kbk^{-1})^2.$$

Let $p = kbk^{-1}$. Then $p \in P$ and $pp^* = gg^*$, whence by the first part of the
theorem, pK = gK, which proves G = PK. For the uniqueness, suppose
$p_1k_1 = p_2k_2$. Taking the quadratic map yields $p_1^2 = p_2^2$, and Corollary 5.3
shows $p_1 = p_2$, whence $k_1 = k_2$, thus proving the theorem.
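For the real analogue $SL_2(\mathbf{R})$ (hermitian replaced by symmetric), the decomposition g = pk can be computed in closed form. The sketch below is an illustration, not the method of the text: the function names are hypothetical, and the square root of the 2 x 2 positive definite matrix $m = g\,{}^t g$ is taken from the identity $(m + sI)^2 = (\mathrm{tr}\, m + 2s)m$ with $s = \sqrt{\det m}$, which follows from the Cayley-Hamilton theorem and is an assumption brought in from outside this section.

```python
import math


def mul2(m, n):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def transpose2(m):
    return ((m[0][0], m[1][0]), (m[0][1], m[1][1]))


def cartan_sl2(g):
    """Return (p, k) with g = p k, p symmetric positive definite, k orthogonal."""
    m = mul2(g, transpose2(g))                      # quadratic map g -> g tg
    s = math.sqrt(m[0][0] * m[1][1] - m[0][1] * m[1][0])
    t = math.sqrt(m[0][0] + m[1][1] + 2.0 * s)
    # p = (m + s I) / t is the positive definite square root of m
    p = (((m[0][0] + s) / t, m[0][1] / t),
         (m[1][0] / t, (m[1][1] + s) / t))
    det_p = p[0][0] * p[1][1] - p[0][1] * p[1][0]
    p_inv = ((p[1][1] / det_p, -p[0][1] / det_p),
             (-p[1][0] / det_p, p[0][0] / det_p))
    k = mul2(p_inv, g)                              # k = p^-1 g
    return p, k
```

For g with rows (1, 2) and (0, 1), the factor k comes out as the rotation matrix with entries $\pm 1/\sqrt{2}$.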

Theorem 5.5. The group G has the decomposition called polar

$$G = KAK.$$

If $g \in G$ is written as a product $g = k_1bk_2$ with $k_1, k_2 \in K$ and $b \in A$, then b
is uniquely determined up to a permutation of the diagonal elements.

Proof. From G = PK (Theorem 5.4) and P = c(K)A (Theorem 5.1), we
get G = KAK. Then $gg^* = k_1b^2k_1^{-1}$. Hence the roots of the characteristic
polynomial of $gg^*$ are the diagonal elements $b_1^2, \ldots, b_n^2$, uniquely determined
up to a permutation. Then $b_1, \ldots, b_n$ are also uniquely determined up to a
permutation, as was to be shown.

The linear structure of the permutations of the diagonal elements will be
further analyzed in the next section (Theorems 6.1 and 6.2).

VI, §5. EXERCISES 

1. Verify in detail that Theorems 5.1, 5.2, 5.3, 5.4 are valid for $SL_n(\mathbf{R})$, with exactly
the same proofs, replacing "hermitian" by "symmetric" and using $SPos_n(\mathbf{R})$. For
further results, cf. for instance J. Jorgenson and S. Lang, Spherical Inversion on
$SL_n(\mathbf{R})$, Chapter I and Chapter V, §2 (the Bruhat decomposition). That book
gives an exposition on $SL_n$ of what is usually regarded as more advanced topics.
One sees how some algebraic structures are used as backdrop for analysis on
certain types of groups.

2. Let $Her = Her_n(\mathbf{C})$ be the real vector space of hermitian n x n matrices, and
$Sk = Sk_n(\mathbf{C})$ the real vector space of skew hermitian matrices, that is, matrices Z
satisfying ${}^t\bar{Z} = -Z$. Show that $Mat_n(\mathbf{C})$ is the direct sum Her + Sk.

3. Let L be hermitian, and let v, v' be eigenvectors with distinct eigenvalues for L.
Show that v, v' are orthogonal.


VI, §6. THE CONJUGATION ACTION 

Let G be a group. The conjugation action of G on itself is defined for
$g, g' \in G$ by

$c(g)g' = gg'g^{-1}$.

It is immediately verified that the map $g \mapsto c(g)$ is a homomorphism of G
into Aut(G) (the group of automorphisms of G). Then G also acts on spaces
naturally associated to G. We describe such spaces.

Consider the special case when $G = SL_n(F)$ with a field F. Let:

$\mathfrak{h}$ = vector space of diagonal matrices $\mathrm{diag}(h_1, \ldots, h_n)$ with trace 0,
$\sum h_i = 0$.

$\mathfrak{n}$ = vector space of strictly upper triangular matrices $(h_{ij})$ with $h_{ij} = 0$ if
$i \geq j$.

${}^t\mathfrak{n}$ = vector space of strictly lower triangular matrices.

$\mathfrak{g}$ = vector space of n × n matrices in $\mathrm{Mat}_n(F)$ with trace 0.

To denote the dependence on F, we may write $\mathfrak{h}(F)$, $\mathfrak{n}(F)$, ${}^t\mathfrak{n}(F)$, and
$\mathfrak{g}(F) = \mathfrak{sl}_n(F)$.

We shall now use the diagonal group:

D = group of diagonal matrices in F with determinant 1.

Then $\mathfrak{g}$ is the direct sum

(1) $\mathfrak{g} = \mathfrak{h} + \mathfrak{n} + {}^t\mathfrak{n}$.

Furthermore, D acts by conjugation on $\mathfrak{g}$. One can then decompose $\mathfrak{g}$ into a
direct sum of eigenspaces for this action as follows. Let $\alpha_{ij}: D \to F^*$ be the
homomorphism (called a character) defined on a diagonal element
$a = \mathrm{diag}(a_1, \ldots, a_n)$ by

$a^{\alpha_{ij}} = a_i/a_j$.

Let $E_{ij}$ (i < j) be the matrix with ij-component 1 and all other components
0. Then for $a \in D$, we have

$c(a)E_{ij} = aE_{ij}a^{-1} = (a_i/a_j)E_{ij} = a^{\alpha_{ij}}E_{ij}$

by direct computation. Thus $\alpha_{ij}$ (written exponentially) is a homomorphism
of D into $F^*$. The set of such homomorphisms will be called the set
of regular characters, denoted by $\mathcal{R}(\mathfrak{n})$. Note that $\mathfrak{n}$ is the direct sum of the
1-dimensional eigenspaces having basis $E_{ij}$ (i < j). We write

(2) $\mathfrak{n} = \bigoplus_{\alpha \in \mathcal{R}(\mathfrak{n})} \mathfrak{n}_\alpha = \bigoplus_{\alpha \in \mathcal{R}(\mathfrak{n})} FE_\alpha$,

where $\mathfrak{n}_\alpha$ is the subspace of elements $X \in \mathfrak{n}$ such that $aXa^{-1} = a^\alpha X$
for all $a \in D$. We have similarly

(3) ${}^t\mathfrak{n} = \bigoplus_{\alpha} ({}^t\mathfrak{n})_{-\alpha}$.

Note that $\mathfrak{h}$ is the 1-eigenspace for the conjugation action of D.
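
The eigenvector computation above can be checked on a small case. The following is a minimal sketch, not part of the original text: it assumes n = 3, F = Q, and an arbitrarily chosen diagonal element a = diag(2, 3, 1/6) of determinant 1; the helper names are hypothetical.

```python
from fractions import Fraction

def matmul(x, y):
    """Multiply two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def diag(entries):
    """Diagonal matrix with the given diagonal entries."""
    n = len(entries)
    return [[entries[i] if i == j else Fraction(0) for j in range(n)]
            for i in range(n)]

def elementary(n, i, j):
    """Matrix E_ij (0-based indices): 1 in position (i, j), 0 elsewhere."""
    return [[Fraction(1) if (r, c) == (i, j) else Fraction(0)
             for c in range(n)] for r in range(n)]

def conjugate_by_diagonal(entries, x):
    """Return a x a^{-1} for a = diag(entries)."""
    a = diag(entries)
    a_inv = diag([1 / e for e in entries])
    return matmul(matmul(a, x), a_inv)

# Assumed sample element of D in SL_3(Q): the product of the entries is 1.
a = [Fraction(2), Fraction(3), Fraction(1, 6)]

def is_eigenvector(i, j):
    """Check that a E_ij a^{-1} equals (a_i / a_j) E_ij."""
    e = elementary(3, i, j)
    scaled = [[(a[i] / a[j]) * v for v in row] for row in e]
    return conjugate_by_diagonal(a, e) == scaled
```

The check holds for every pair i ≠ j, so it covers the strictly lower triangular matrices as well, where the character is the inverse of the one attached to the transposed position.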

If F = R or C, then instead of D and $\mathfrak{h}$, we may consider only the group A
of diagonal matrices with positive diagonal elements, and $\mathfrak{a}$ the real vector
space of real diagonal matrices with trace 0. The regular characters are given by the
same formula as above.

We have expressed $\mathfrak{g}$ as a direct sum of eigenspaces. Over the arbitrary
field F, these eigenspaces can be taken to have dimension 1. This is
clear from (2) and (3) for $\mathfrak{n}$ and ${}^t\mathfrak{n}$. For $\mathfrak{h}$ itself, basis elements of any basis
will do. The usual, and most natural basis, consists of the elements
$H_1, \ldots, H_{n-1}$ where

$H_i = \mathrm{diag}(0, \ldots, 0, 1_i, -1_{i+1}, 0, \ldots, 0)$,

with 1 in the i-th place and −1 in the (i+1)-th place.
By an algebra we mean a vector space with a bilinear map into itself,
called a product. We make $\mathfrak{g}$ into an algebra by defining the Lie product of
$X, Y \in \mathfrak{g}$ to be

$[X, Y] = XY - YX$.

It is immediately verified that this product is bilinear but not associative. We
call $\mathfrak{g}$ the Lie algebra of G. Let the space of linear maps of $\mathfrak{g}$ into itself be
denoted by $\mathrm{End}(\mathfrak{g})$, whose elements are called endomorphisms of $\mathfrak{g}$. By
definition the regular representation of $\mathfrak{g}$ on itself is the map

$\mathfrak{g} \to \mathrm{End}(\mathfrak{g})$

which to each $X \in \mathfrak{g}$ associates the endomorphism L(X) of $\mathfrak{g}$ such that

$L(X)(Y) = [X, Y]$.

Note that $X \mapsto L(X)$ is a linear map. We also write $L_X$ for L(X).

Verify the derivation property for all $Y, Z \in \mathfrak{g}$,

$L_X[Y, Z] = [L_XY, Z] + [Y, L_XZ]$.

Using only the bracket notation, this looks like

$[X, [Y, Z]] = [[X, Y], Z] + [Y, [X, Z]]$.
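
The derivation property, and the failure of associativity, can be tested on sample matrices. This is only a numerical sanity check under assumed data (three arbitrarily chosen 2 × 2 integer matrices of trace 0), not a proof; the function names are hypothetical.

```python
def matmul(x, y):
    """Multiply two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(x, y):
    """Entrywise sum of two matrices."""
    return [[p + q for p, q in zip(r, s)] for r, s in zip(x, y)]

def bracket(x, y):
    """Lie product [X, Y] = XY - YX."""
    xy, yx = matmul(x, y), matmul(y, x)
    return [[p - q for p, q in zip(r, s)] for r, s in zip(xy, yx)]

# Assumed sample elements of the trace-0 matrices (n = 2).
X = [[1, 2], [3, -1]]
Y = [[0, 1], [0, 0]]
Z = [[2, 0], [5, -2]]

# Derivation property: [X,[Y,Z]] = [[X,Y],Z] + [Y,[X,Z]].
lhs = bracket(X, bracket(Y, Z))
rhs = add(bracket(bracket(X, Y), Z), bracket(Y, bracket(X, Z)))

# Associativity would require [[X,Y],Z] = [X,[Y,Z]], which fails here.
is_associative_here = bracket(bracket(X, Y), Z) == bracket(X, bracket(Y, Z))
```

For these matrices `lhs` and `rhs` agree, while `is_associative_here` is false, which illustrates that the product is bilinear but not associative.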

We use $\alpha$ also to denote the additive linear character on $\mathfrak{h}$ given on a
diagonal matrix $H = \mathrm{diag}(h_1, \ldots, h_n)$ by

$\alpha_{ij}(H) = h_i - h_j$, so $\alpha_{ij}: \mathfrak{h} \to F$.

This is the additive version of the multiplicative character previously
considered on D. Then each $\mathfrak{n}_\alpha$ is also the $\alpha$-eigenspace for the
additive character $\alpha$, namely for $H \in \mathfrak{h}$, we have

$[H, E_\alpha] = \alpha(H)E_\alpha$,

which you can verify at once from the definition of multiplication of
matrices.

We let Nor(D) be the normalizer of D in G, that is, the subgroup of
elements $g \in G$ such that $gDg^{-1} = D$. We let Cen(D) be the centralizer of D,
i.e. the subgroup of $g \in G$ such that $gzg^{-1} = z$ for all $z \in D$. Verify that Cen(D) is normal in
Nor(D).
Theorem 6.1. For $g \in \mathrm{Nor}(D)$, conjugation c(g) restricted to D permutes
the diagonal elements. The map $g \mapsto c(g)$ induces an isomorphism of
Nor(D)/Cen(D) with the permutation group of the diagonal.

Proof. The characteristic polynomial is $P(t) = \prod (t - z_i)$, where $z_1, \ldots, z_n$
are the diagonal elements. It is unchanged under conjugation, so we get an
injection of Nor(D)/Cen(D) into the permutation group. To show the
surjectivity, you have to prove that given a permutation, there is $g \in \mathrm{Nor}(D)$
such that c(g) induces the permutation. Do this as an exercise.

Actually, you may want to work out Theorem 6.1 first in the case F = R,
$G = SL_n(\mathbf{R})$, in which case we can use A instead of D. Thus we now let M'
be the subgroup of K consisting of the matrices which normalize A, that is k
such that $kAk^{-1} = A$. Then we let M be the centralizer of A in K. The
group M'/M is called the Weyl group.

Theorem 6.2. Elements of the Weyl group act as a group of permutations
of the diagonal elements. For $k \in M'$, let $w_k$ be the corresponding
permutation. Then $k \mapsto w_k$ gives an isomorphism

$W = M'/M \to$ Permutation group of the diagonal.

Proof. Exercise.

You will find that the “permutation matrices” have components which
are 0 or ±1, and can then be used over any field F of any characteristic.
Check the case n = 2 first. You can see the exercise worked out in
Jorgenson-Lang, Spherical Inversion on $SL_n(\mathbf{R})$, Springer Verlag 2001, Chapter III,
Theorem 3.4.
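
For the case n = 2, the following sketch illustrates one matrix with components 0 or ±1 and determinant 1 whose conjugation action swaps the two diagonal elements. It is an illustration under assumptions (the particular matrix W and the sample diagonal element are chosen here, not taken from the text), and it does not replace the exercise.

```python
from fractions import Fraction

def matmul(x, y):
    """Multiply two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Assumed candidate: entries 0 or +-1, determinant 1, so W lies in SL_2.
W = [[0, 1], [-1, 0]]
W_INV = [[0, -1], [1, 0]]

def conjugate_by_w(a1, a2):
    """Return W diag(a1, a2) W^{-1}."""
    d = [[a1, 0], [0, a2]]
    return matmul(matmul(W, d), W_INV)

# Sample diagonal element of determinant 1.
swapped = conjugate_by_w(Fraction(5), Fraction(1, 5))
```

Here `swapped` is diag(1/5, 5), so conjugation by W induces the transposition of the two diagonal places. The sign −1 is what keeps the determinant equal to 1.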


VI, §6. EXERCISES 

1. A diagonal n × n matrix H is called regular if its diagonal elements are
distinct. Let $X \in \mathrm{Mat}_n(F)$ (with a field F). If X commutes with one regular H,
show that X is diagonal. Note that H is regular if and only if $\alpha(H) \neq 0$ for all
regular characters $\alpha$.

2. Let F = R, and K the real unitary subgroup of $SL_n(\mathbf{R})$. Let M be the subgroup of
K consisting of elements commuting with all elements of $\mathfrak{a}$, that is, the centralizer
of $\mathfrak{a}$ in K. Then M consists of the diagonal matrices with ±1 on the diagonal.

3. Prove Theorems 6.1 and 6.2. Do it first when n = 2. Then for arbitrary n,
determine the matrix inducing the transposition of $h_i$ and $h_{i+1}$ in a diagonal matrix
$\mathrm{diag}(h_1, \ldots, h_n)$.


CHAPTER VII 


Field Theory 


VII, §1. ALGEBRAIC EXTENSIONS 

Let F be a subfield of a field E. We also say that E is an extension of F
and we denote the extension by E/F. Let F be a field. An element $\alpha$ in
some extension of F is said to be algebraic over F if there exists a
non-zero polynomial $f \in F[t]$ such that $f(\alpha) = 0$, i.e. if $\alpha$ satisfies a
polynomial equation

$a_n\alpha^n + \cdots + a_0 = 0$

with coefficients in F, not all 0. If F is a subfield of E, and every
element of E is algebraic over F, we say that E is algebraic over F.

Example 1. If $\alpha^2 = 2$, i.e. if $\alpha$ is one of the two possible square roots
of 2, then $\alpha$ is algebraic over the rational numbers Q. Similarly, a cube
root of 2 is algebraic. Any one of the numbers $e^{2\pi i/n}$ (with n integer ≥ 1)
is algebraic over Q, since it is a root of $t^n - 1$. It is known (but hard to
prove) that neither e nor $\pi$ is algebraic over Q.

Let E be an extension of F. We may view E as a vector space over F. 
We shall say that E is a finite extension if E is a finite dimensional vector 
space over F. 

Example 2. C is a finite extension of R, and { 1 , i } is a basis of C over 
R. The real numbers are not a finite extension of Q. 

Theorem 1.1. If E is a finite extension of F, then every element of E is 
algebraic over F. 
Proof. The powers of an element $\alpha$ of E, namely $1, \alpha, \alpha^2, \ldots, \alpha^n$ cannot
be linearly independent over F, if n > dim E. Hence there exist elements
$a_0, \ldots, a_n \in F$ not all 0 such that $a_n\alpha^n + \cdots + a_0 = 0$. This means that $\alpha$ is
algebraic over F.

Proposition 1.2. Let $\alpha$ be algebraic over F. Let J be the ideal of
polynomials in F[t] of which $\alpha$ is a root, i.e. polynomials f such that
$f(\alpha) = 0$. Let p(t) be a generator of J, with leading coefficient 1. Then
p is irreducible.

Proof. Suppose that p = gh with deg g < deg p and deg h < deg p.
Since $p(\alpha) = 0$, we have $g(\alpha) = 0$ or $h(\alpha) = 0$. Say $g(\alpha) = 0$. Since
deg g < deg p, this is impossible, by the assumption on p.

The irreducible polynomial p (with leading coefficient 1) is uniquely
determined by $\alpha$ in F[t], and will be called the irreducible polynomial of
$\alpha$ over F. Its degree will be called the degree of $\alpha$ over F. We shall
immediately give another interpretation for this degree.

Theorem 1.3. Let $\alpha$ be algebraic over F. Let n be the degree of its
irreducible polynomial over F. Then the vector space generated over F by
$1, \alpha, \ldots, \alpha^{n-1}$ is a field, and the dimension of that vector space is n.

Proof. Let f be any polynomial in F[t]. We can find $q, r \in F[t]$ such
that f = qp + r, and deg r < deg p. Then

$f(\alpha) = q(\alpha)p(\alpha) + r(\alpha) = r(\alpha)$.

Hence if we denote the vector space generated by $1, \alpha, \ldots, \alpha^{n-1}$ by E, we
find that the product of two elements of E is again in E. Suppose that
$f(\alpha) \neq 0$. Then f is not divisible by p. Hence there exist polynomials
$g, h \in F[t]$ such that

$gf + hp = 1$.

We obtain $g(\alpha)f(\alpha) + h(\alpha)p(\alpha) = 1$, whence $g(\alpha)f(\alpha) = 1$. Thus every
non-zero element of E is invertible, and hence E is a field. Finally, the
elements $1, \alpha, \ldots, \alpha^{n-1}$ are linearly independent over F, because a
non-trivial linear relation among them would be a non-zero polynomial of
degree < n with root $\alpha$, and such a polynomial cannot be divisible by p.
Hence the dimension of E over F is n.
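
The proof is constructive: the relation gf + hp = 1 comes from the Euclidean algorithm. The following is a minimal sketch, assuming F = Q and $p(t) = t^3 - 2$ as a sample case; polynomials are stored as coefficient lists with the constant term first, and all function names are hypothetical.

```python
from fractions import Fraction

def trim(f):
    """Drop zero coefficients at the high-degree end."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f

def poly(*coeffs):
    """Build a polynomial from coefficients, constant term first."""
    return trim([Fraction(c) for c in coeffs])

def add(f, g):
    n = max(len(f), len(g))
    f = f + [Fraction(0)] * (n - len(f))
    g = g + [Fraction(0)] * (n - len(g))
    return trim([x + y for x, y in zip(f, g)])

def mul(f, g):
    if not f or not g:
        return []
    out = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, x in enumerate(f):
        for j, y in enumerate(g):
            out[i + j] += x * y
    return trim(out)

def divmod_poly(f, p):
    """Return (q, r) with f = q p + r and deg r < deg p (p non-zero)."""
    f, p = trim(f), trim(p)
    if not p:
        raise ZeroDivisionError("division by the zero polynomial")
    q = [Fraction(0)] * max(len(f) - len(p) + 1, 1)
    r = f
    while len(r) >= len(p):
        shift = len(r) - len(p)
        c = r[-1] / p[-1]
        q[shift] = c
        # Subtract c * t^shift * p to cancel the leading term of r.
        r = add(r, mul([Fraction(0)] * shift + [-c], p))
    return trim(q), r

def inverse_mod(f, p):
    """Return g with g f = 1 mod p, assuming f and p are relatively prime."""
    # Extended Euclidean algorithm. Invariant: r_k = s_k * f (mod p).
    r0, s0 = trim(p), []
    r1, s1 = divmod_poly(f, p)[1], [Fraction(1)]
    while r1:
        q, r2 = divmod_poly(r0, r1)
        r0, s0, r1, s1 = r1, s1, r2, add(s0, mul([-c for c in q], s1))
    if len(r0) != 1:
        raise ValueError("f and p are not relatively prime")
    return divmod_poly([c / r0[0] for c in s0], p)[1]

p = poly(-2, 0, 0, 1)   # t^3 - 2
f = poly(1, 1)          # represents 1 + alpha, where alpha^3 = 2
g = inverse_mod(f, p)   # represents the inverse of 1 + alpha
```

In this sample case `g` represents $(1 - \alpha + \alpha^2)/3$, in agreement with the identity $(1 + \alpha)(1 - \alpha + \alpha^2) = 1 + \alpha^3 = 3$.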

The field generated by the powers of $\alpha$ over F as in Theorem 1.3 will
be denoted by $F(\alpha)$.

If E is a finite extension of F, we denote by

$[E : F]$

the dimension of E viewed as vector space over F, and call it the degree
of E over F.

Remark. If [E : F] = 1 then E = F. Proof?

Theorem 1.4. Let $E_1$ be a finite extension of F, and let $E_2$ be a finite
extension of $E_1$. Then $E_2$ is a finite extension of F, and

$[E_2 : F] = [E_2 : E_1][E_1 : F]$.

Proof. Let $\{\alpha_1, \ldots, \alpha_n\}$ be a basis of $E_1$ over F, and $\{\beta_1, \ldots, \beta_m\}$ a basis
of $E_2$ over $E_1$. We prove that the elements $\{\alpha_i\beta_j\}$ form a basis of $E_2$
over F. Let v be an element of $E_2$. We can write

$v = \sum_j w_j\beta_j = w_1\beta_1 + \cdots + w_m\beta_m$

with some elements $w_j \in E_1$. We write each $w_j$ as a linear combination of
$\alpha_1, \ldots, \alpha_n$ with coefficients in F, say

$w_j = \sum_i c_{ij}\alpha_i$.

Substituting, we find

$v = \sum_j \sum_i c_{ij}\alpha_i\beta_j$.

Hence the elements $\alpha_i\beta_j$ generate $E_2$ over F. Assume that we have a
relation

$0 = \sum_j \sum_i x_{ij}\alpha_i\beta_j$

with $x_{ij} \in F$. Thus

$\sum_j \Big(\sum_i x_{ij}\alpha_i\Big)\beta_j = 0$.

From the linear independence of $\beta_1, \ldots, \beta_m$ over $E_1$, we conclude that

$\sum_i x_{ij}\alpha_i = 0$

for each j, and from the linear independence of $\alpha_1, \ldots, \alpha_n$ over F we
conclude that $x_{ij} = 0$ for all i, j as was to be shown.
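
As a worked instance of Theorem 1.4 (a sketch added for illustration, not part of the original text), take $\alpha$ to be the real cube root of 2 and $\zeta = e^{2\pi i/3}$. It assumes that $t^3 - 2$ is irreducible over Q, and it uses that $\mathbf{Q}(\alpha)$ is contained in the real numbers, so $\zeta$ does not lie in it.

```latex
% Tower Q \subset E_1 = Q(\alpha) \subset E_2 = Q(\alpha, \zeta),
% with \alpha^3 = 2 real and \zeta = e^{2\pi i/3}.
\begin{align*}
[E_1 : \mathbf{Q}] &= 3, & \text{basis } &\{1, \alpha, \alpha^2\}, \\
[E_2 : E_1] &= 2, & \text{basis } &\{1, \zeta\}
  \quad (\zeta^2 + \zeta + 1 = 0,\ \zeta \notin E_1 \subset \mathbf{R}), \\
[E_2 : \mathbf{Q}] &= 2 \cdot 3 = 6, & \text{basis } &\{\alpha^i \zeta^j : 0 \le i \le 2,\ 0 \le j \le 1\}.
\end{align*}
```

The six products $\alpha^i\zeta^j$ are exactly the elements $\alpha_i\beta_j$ of the proof, with n = 3 and m = 2.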

Let $\alpha, \beta$ be algebraic over F. We suppose $\alpha, \beta$ are contained in some
extension E of F. Then a fortiori, $\beta$ is algebraic over $F(\alpha)$. We can form
the field $F(\alpha)(\beta)$. Any field which contains F and $\alpha, \beta$ will contain $F(\alpha)(\beta)$.
Hence $F(\alpha)(\beta)$ is the smallest field containing F and both $\alpha, \beta$.
Furthermore, by Theorem 1.4, $F(\alpha)(\beta)$ is finite over F, being decomposed in the
inclusions

$F \subset F(\alpha) \subset F(\alpha)(\beta)$.

Hence by Theorem 1.1, the field $F(\alpha)(\beta)$ is algebraic over F and in
particular $\alpha\beta$ and $\alpha + \beta$ are algebraic over F. Furthermore, it does not
matter whether we write $F(\alpha)(\beta)$ or $F(\beta)(\alpha)$. Thus we shall denote this
field by $F(\alpha, \beta)$.

Inductively, if $\alpha_1, \ldots, \alpha_r$ are algebraic over F, and contained in some
extension of F, we let $F(\alpha_1, \ldots, \alpha_r)$ be the smallest field containing F and
$\alpha_1, \ldots, \alpha_r$. It can be expressed as $F(\alpha_1)(\alpha_2) \cdots (\alpha_r)$. It is algebraic over F by
repeated applications of Theorem 1.4. We call it the field obtained by
adjoining $\alpha_1, \ldots, \alpha_r$ to F, and we say that $F(\alpha_1, \ldots, \alpha_r)$ is generated by
$\alpha_1, \ldots, \alpha_r$ over F, or that $\alpha_1, \ldots, \alpha_r$ are generators over F.

If S is a set of elements in some field containing F, then we denote by
F(S) the smallest field containing S and F. If, for instance, S consists of
elements $\alpha_1, \alpha_2, \alpha_3, \ldots$ then F(S) is the union of all the fields

$\bigcup_{r=1}^{\infty} F(\alpha_1, \ldots, \alpha_r)$.

r- 1 


Let $\mu_n$ be the group of n-th roots of unity in some extension of F. Then
we shall often consider the field

$F(\mu_n)$.

Since $\mu_n$ is cyclic by Theorem 1.10 of Chapter IV, it follows that if $\zeta$ is a
generator of $\mu_n$, then

$F(\mu_n) = F(\zeta)$.

Remark. An extension E of F can be algebraic without its being finite.
This can happen of course only if E is generated by an infinite number
of elements. For instance, let

$E = \mathbf{Q}(2^{1/2}, 2^{1/3}, \ldots, 2^{1/n}, \ldots)$,

so E is obtained by adjoining all the n-th roots of 2 for all positive
integers n. Then E is not finite over Q.


Warning. We have just used the notation $2^{1/n}$ to denote an n-th root
of 2. Of course, we can mean the real n-th root of 2, and one usually
means this real root. You should know that in the complex numbers,
there are other elements $\alpha$ such that $\alpha^n = 2$, and these also have a right
to be called n-th roots of 2. Therefore in general, I strongly advise
against using the notation $a^{1/n}$ to denote an n-th root of an element a in
a field, because such an element is not uniquely determined. One should
use a letter, for instance $\alpha$, to denote an element whose n-th power is a,
that is $\alpha^n = a$. The totality of such elements can be denoted by $\alpha_1, \ldots, \alpha_n$.

Whenever we have a sequence of extension fields

$F \subset E_1 \subset E_2 \subset \cdots \subset E_r$

we call such a sequence a tower of fields. One could express Theorem 1.4
by saying that:

The degree is multiplicative in towers.

We could call the field $E = \mathbf{Q}(2^{1/2}, 2^{1/3}, \ldots, 2^{1/n}, \ldots)$ an infinite tower,
defined by the sequence of fields

$\mathbf{Q} \subset \mathbf{Q}(2^{1/2}) \subset \mathbf{Q}(2^{1/2}, 2^{1/3}) \subset \mathbf{Q}(2^{1/2}, 2^{1/3}, 2^{1/4}) \subset \cdots$.

We now return to possibly infinite algebraic extensions in general.

Let A be the set of all complex numbers which are algebraic over Q.
Then A is a field which is algebraic over Q but is not finite over Q
(Exercise 16).

We define a field A to be algebraically closed if every polynomial of
degree ≥ 1 with coefficients in A has a root in A. It follows that if f is
such a polynomial, then f has a factorization

$f(t) = c(t - \alpha_1) \cdots (t - \alpha_n)$

with $c \neq 0$ and $\alpha_1, \ldots, \alpha_n \in A$.

Let F be a field. By an algebraic closure of F we mean a field A which 
is algebraically closed and algebraic over F. We shall prove later that an 
algebraic closure exists, and is uniquely determined up to isomorphism. 

Example. It will be proved later in this book that the complex 
numbers C are algebraically closed. The set of numbers in C which are 
algebraic over Q form a subfield, which is an algebraic closure of Q. 

Let $E_1, E_2$ be extensions of a field F, and suppose that $E_1, E_2$ are
contained in some larger field K. By the composite $E_1E_2$ we mean the
smallest subfield of K containing $E_1$ and $E_2$. This composite exists, and
is the intersection of all subfields of K containing $E_1$ and $E_2$. Since there
is at least one such subfield, namely K itself, the intersection is not
empty. If $E_1 = F(\alpha)$ and $E_2 = F(\beta)$ then $E_1E_2 = F(\alpha, \beta)$.

Remark. If $E_1, E_2$ are not contained in some larger field, then the
notion of “composite” field does not make sense, and we don’t use it.
Given a finite extension E of a field F, there may be several ways this
extension can be embedded in a larger field containing F. For instance,
let $E = \mathbf{Q}(\alpha)$ where $\alpha$ is a root of the polynomial $t^3 - 2$. Then there are
three ways E can be embedded in C, as we shall see later. The element $\alpha$
may get mapped on the three roots of this polynomial. Once we deal
with subfields of C, then of course we can form their composite in C. We
shall see in Chapter IX that there are other natural fields in which we
can embed finite extensions of Q, namely the “p-adic completions” of Q.
A priori, there is no natural intersection between such a p-adic
completion and the real numbers, except Q itself.

Let $E_1, E_2$ be finite extensions of F, contained in some larger field.
We are interested in the degree $[E_1E_2 : F]$. Suppose that $E_1 \cap E_2 = F$.
It does not necessarily follow that $[E_1E_2 : E_2] = [E_1 : F]$.


Example. Let $\alpha$ be the real cube root of 2, and let $\beta$ be a complex
cube root of 2, that is $\beta = \zeta\alpha$, where $\zeta = e^{2\pi i/3}$ for example. Let
$E_1 = \mathbf{Q}(\alpha)$ and $E_2 = \mathbf{Q}(\beta)$. Then

$E_1 \cap E_2 = \mathbf{Q}$

because $[E_1 : E_1 \cap E_2]$ divides 3 by Theorem 1.4, and cannot equal 1
since $E_1 \neq E_2$. Hence by Theorem 1.4, $[E_1 \cap E_2 : \mathbf{Q}] = 1$, so $E_1 \cap E_2 = \mathbf{Q}$.
Note that

$E_1E_2 = \mathbf{Q}(\alpha, \beta) = \mathbf{Q}(\alpha, \sqrt{-3})$

as you can verify easily since

$\zeta = \dfrac{-1 + \sqrt{-3}}{2}$.

We have $[E_1 : \mathbf{Q}] = [E_2 : \mathbf{Q}] = 3$, and $[E_1E_2 : E_1] = 2$. However, as you
will see in Exercise 12, this dropping of degrees cannot occur if $[E_1 : \mathbf{Q}]$
and $[E_2 : \mathbf{Q}]$ are relatively prime. But the dropping of degrees is
compatible with a general fact:

Proposition 1.5. Let E/F be a finite extension, and let F' be any
extension of F. Suppose that E, F' are contained in some field, and let
EF' be the composite. Then

$[EF' : F'] \leq [E : F]$.

Proof. Exercise 11. Write $E = F(\alpha_1, \ldots, \alpha_r)$ and use induction. The field
diagram illustrating the proposition is as follows.

        EF'
       /   \
      E     F'
       \   /
         F

Because of the similarity of the diagram with those in plane geometry,
one calls the extension EF'/F' the translation of E/F over F'. Sometimes,
F is called the base field of the extension E/F, in which case the extension
EF'/F' is also called the base change of E/F over F'.

We have now met three basic constructions of field extensions: 

Translation of an extension. 

Tower of extensions. 

Composite of extensions. 

We shall find these constructions occurring systematically through the 
rest of field theory, in various contexts. 

So far, we have been dealing with arbitrary fields. We now come to a
result for which it is essential to make some restrictions. Indeed, let K be
a field and let $f(t) \in K[t]$ be a polynomial. We have seen how to define
the derivative f'(t). If K has characteristic p, then this derivative may be
identically zero. For instance, let

$f(t) = t^p$.

Then f has degree p, and $f'(t) = pt^{p-1} = 0$ because p = 0 in K. Such a
phenomenon cannot happen in characteristic 0.
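
The vanishing of the derivative can be seen by direct computation. This is a small sketch assuming K = Z/pZ with p = 5; polynomials are coefficient lists with the constant term first, and the function name is hypothetical.

```python
def formal_derivative(coeffs, p):
    """Formal derivative of sum coeffs[k] * t^k, coefficients taken in Z/pZ."""
    # The term c * t^k contributes (k * c) * t^(k-1); the constant term drops out.
    return [(k * c) % p for k, c in enumerate(coeffs)][1:]

p = 5
t_to_the_p = [0] * p + [1]                         # f(t) = t^5
derivative = formal_derivative(t_to_the_p, p)      # all coefficients vanish mod 5

other = formal_derivative([0, 1, 0, 1], p)         # f(t) = t^3 + t, f'(t) = 3t^2 + 1
```

Every coefficient of `derivative` is 0, although f has degree 5, while the derivative of $t^3 + t$ is non-zero because 3 is not divisible by 5.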

Theorem 1.6. Let F be a subfield of an algebraically closed field A of
characteristic 0. Let f be an irreducible polynomial in F[t]. Let
n = deg f. Then f has n distinct roots in A.

Proof. We can write

$f(t) = a_n(t - \alpha_1) \cdots (t - \alpha_n)$

with $\alpha_i \in A$. Let $\alpha$ be a root of f. It will suffice to prove that $\alpha$ has
multiplicity 1. We note that f is the irreducible polynomial of $\alpha$ over F.
We also note that the formal derivative f' has degree < n. (Cf. Chapter
IV, §3.) Hence we cannot have $f'(\alpha) = 0$, because f' is not the zero
polynomial (immediate from the definition of the formal derivative: the
leading coefficient of f' is $na_n \neq 0$). Hence $\alpha$ has multiplicity 1 by Theorem
3.6 of Chapter IV. This concludes the proof of the theorem.

Remark. The same proof applies to the following statement, under a
weaker hypothesis than characteristic zero.

Let F be any field and f an irreducible polynomial in F[t] of degree
n ≥ 1. If the characteristic is p and $p \nmid n$, then every root of f has
multiplicity 1.


VII, §1. EXERCISES 

1. Let $\alpha^2 = 2$. Show that the field $\mathbf{Q}(\alpha)$ is of degree 2 over Q.

2. Show that a polynomial $(t - a)^2 + b^2$ with a, b rational, $b \neq 0$, is irreducible
over the rational numbers.

3. Show that the polynomial $t^3 - p$ is irreducible over the rational numbers for
each prime number p.

4. What is the degree of the following fields over Q? Prove your assertion. 

(a) $\mathbf{Q}(\alpha)$ where $\alpha^3 = 2$

(b) $\mathbf{Q}(\alpha)$ where $\alpha^3 = p$ (prime)

(c) Q(a) where a is a root of F — t — 1 

(d) $\mathbf{Q}(\alpha, \beta)$ where $\alpha$ is a root of $t^2 - 2$ and $\beta$ is a root of $t^2 - 3$

(e) v % 

5. Show that the cube root of unity $\zeta = e^{2\pi i/3}$ is the root of a polynomial of
degree 2 over Q. Show that $\mathbf{Q}(\zeta) = \mathbf{Q}(\sqrt{-3})$.

6. What is the degree over Q of the number $\cos(2\pi/6)$? Prove your assertion.

7. Let F be a field and let $a, b \in F$. Let $\alpha^2 = a$ and $\beta^2 = b$. Assume that
$\alpha, \beta$ have degree 2 over F. Prove that $F(\alpha) = F(\beta)$ if and only if there exists
$c \in F$ such that $a = c^2b$.

8. Let $E_1, E_2$ be two extensions of a field F. Assume that $[E_2 : F] = 2$, and that
$E_1 \cap E_2 = F$. Let $E_2 = F(\alpha)$. Show that $E_1(\alpha)$ has degree 2 over $E_1$.

9. Let $\alpha^3 = 2$, let $\zeta$ be a complex cube root of unity, and let $\beta = \zeta\alpha$. What is the
degree of $\mathbf{Q}(\alpha, \beta)$ over Q? Prove your assertion.

10. Let $E_1$ have degree p over F and $E_2$ have degree p' over F, where p, p' are
prime numbers. Show that either $E_1 = E_2$ or $E_1 \cap E_2 = F$.

11. Prove Proposition 1.5. 

12. Let $E_1, E_2$ be finite extensions of a field F, and assume $E_1, E_2$ contained in
some field. If $[E_1 : F]$ and $[E_2 : F]$ are relatively prime, show that

$[E_1E_2 : F] = [E_1 : F][E_2 : F]$ and $[E_1E_2 : E_2] = [E_1 : F]$.

In Exercises 13 and 14, assume F has characteristic 0.

13. Let E be an extension of degree 2 of a field F. Show that E can be written
$F(\alpha)$ for some root $\alpha$ of a polynomial $t^2 - a$, with $a \in F$. [Hint: Use the
high school formula for the solution of a quadratic equation.]

14. Let $t^2 + bt + c$ be a polynomial of degree 2 with b, c in F. Let $\alpha$ be a root.
Show that $F(\alpha)$ has degree 2 over F if $b^2 - 4c$ is not a square in F, and
otherwise, that $F(\alpha)$ has degree 1 over F, i.e. $\alpha \in F$.

15. Let $a \in \mathbf{C}$, and $a \neq 0$. Let $\alpha$ be a root of $t^n - a$. Show that all roots of $t^n - a$
are of type $\zeta\alpha$, where $\zeta$ is an n-th root of unity, i.e.

$\zeta = e^{2\pi ik/n}$, k = 0, ..., n − 1.

16. Let A be the set of algebraic numbers over Q in the complex numbers. Prove
that A is a field.

17. Let $E = F(\alpha)$ where $\alpha$ is algebraic over F, of odd degree. Show that
$E = F(\alpha^2)$.

18. Let $\alpha, \beta$ be algebraic over the field F. Let f be the irreducible polynomial of
$\alpha$ over F, and let g be the irreducible polynomial of $\beta$ over F. Suppose deg f
and deg g are relatively prime. Show that g is irreducible over $F(\alpha)$.

19. Let a, /? be complex numbers such that a 3 = 2 and = 3. What is the 
degree of Q(a, fi) over Q? Prove your assertion. 

20. Let $F = \mathbf{Q}[\sqrt{3}]$ be the set of all elements $a + b\sqrt{3}$, where a, b are rational
numbers.

(a) Show that F is a field and that $1, \sqrt{3}$ is a basis of F over Q.



Show that M 2 — 

(c) Let $I = I_2$ be the unit 2 × 2 matrix. Show that the map

$a + b\sqrt{3} \mapsto aI + bM$

is an isomorphism of F onto the subring of $\mathrm{Mat}_2(\mathbf{Q})$ generated by M over
Q.

21. Let $\alpha$ be a root of the polynomial $t^3 + t^2 + 1$ over the field F = Z/2Z. What
is the degree of $\alpha$ over F? Prove your assertion.

22. Let t be a variable and let K = C(t) be the field of rational functions in t. Let
$f(X) = X^n - t$ for some positive integer n. Let $\alpha$ be a root of f in some field
containing C(t). What is the degree of $K(\alpha)$ over K? Prove your assertion.

23. Let F be a field and let t be transcendental over F, that is, not algebraic over F.
Let $x \in F(t)$ and suppose $x \notin F$. Prove that F(t) is algebraic over F(x). If you
express x as a quotient of relatively prime polynomials, x = f(t)/g(t), how
would you describe the degree [F(t) : F(x)] in terms of the degrees of f and g?
Prove all assertions.

24. Let F be a field and p(t) an irreducible polynomial in F[t]. Let g(t) be an
arbitrary polynomial in F[t]. Suppose that there exists an extension field E
of F and an element $\alpha \in E$ which is a root of both p(t) and g(t). Prove that
p(t) divides g(t) in F[t]. Is this conclusion still true if we do not assume p(t)
irreducible?


VII, §2. EMBEDDINGS 

Let F be a field, and L another field. By an embedding of F in L, we
shall mean a mapping

$\sigma: F \to L$

which is a ring-homomorphism. Since $\sigma(1) = 1$ it follows that $\sigma$ is not
the zero mapping. If $x \in F$ and $x \neq 0$ then

$xx^{-1} = 1 \Rightarrow \sigma(x)\sigma(x^{-1}) = 1$,

so $\sigma$ is a homomorphism both for the additive group of F and the
multiplicative group of non-zero elements of F. Furthermore, the kernel
of $\sigma$ viewed as homomorphism of the additive group is 0. It follows that
$\sigma$ is injective, i.e. $\sigma(x) \neq \sigma(y)$ if $x \neq y$. This is the reason for calling $\sigma$ an
embedding. We shall often write $\sigma x$ instead of $\sigma(x)$, and $\sigma F$ instead of
$\sigma(F)$.

An embedding $\sigma: F \to F'$ is said to be an isomorphism if the image of
$\sigma$ is F'. (One should specify an isomorphism of fields, or a
field-isomorphism, but the context will always make our meaning clear.) If
$\sigma: F \to L$ is an embedding, then the image $\sigma F$ of F under $\sigma$ is obviously
a subfield of L, and thus $\sigma$ gives rise to an isomorphism of F with $\sigma F$.
If $\sigma: F \to F'$ is an isomorphism, then one can define an inverse
isomorphism $\sigma^{-1}: F' \to F$ in the usual way.

Let f(t) be a polynomial in F[t]. Let $\sigma: F \to L$ be an embedding.
Write

$f(t) = a_nt^n + \cdots + a_0$.

We define $\sigma f$ to be the polynomial

$\sigma f(t) = \sigma(a_n)t^n + \cdots + \sigma(a_0)$.

Then it is trivially verified that for two polynomials f, g in F[t], we have

$\sigma(f + g) = \sigma f + \sigma g$ and $\sigma(fg) = (\sigma f)(\sigma g)$.

If p(t) is an irreducible polynomial in F[t], then $\sigma p$ is irreducible over
$\sigma F$.

This is an important fact. Its proof is easy, for if we have a factorization

$\sigma p = gh$

over $\sigma F$, then

$p = \sigma^{-1}\sigma p = (\sigma^{-1}g)(\sigma^{-1}h)$

has a factorization over F.

Let $f(t) \in F[t]$, and let $\alpha$ be algebraic over F. Let $\sigma: F(\alpha) \to L$ be an
embedding into some field L. Then

$(\sigma f)(\sigma\alpha) = \sigma(f(\alpha))$.

This is immediate from the definition of an embedding, for if f(t) is as
above, then

$f(\alpha) = a_n\alpha^n + \cdots + a_0$,

whence

(*) $\sigma(f(\alpha)) = \sigma(a_n)\sigma(\alpha)^n + \cdots + \sigma(a_0)$.

In particular, we obtain two important properties of embeddings:

Property 1. If $\alpha$ is a root of f then $\sigma(\alpha)$ is a root of $\sigma f$.

Property 2. If $\sigma$ is an embedding of $F(\alpha)$ whose effect is known on F and
on $\alpha$, then the effect of $\sigma$ is uniquely determined on $F(\alpha)$ by (*).

Let $\sigma: F \to L$ be an embedding. Let E be an extension of F. An
embedding $\tau: E \to L$ is said to be an extension of $\sigma$ if $\tau(x) = \sigma(x)$ for all
$x \in F$. We also say that $\sigma$ is a restriction of $\tau$ to F, or that $\tau$ is over $\sigma$.

Let E be an extension of F and let $\sigma$ be an isomorphism, or
embedding of E which restricts to the identity on F. Then $\sigma$ is said to be
an isomorphism or embedding of E over F.




Theorem 2.1. Let $\sigma: F \to L$ be an embedding. Let p(t) be an irreducible
polynomial in F[t]. Let $\alpha$ be a root of p in some extension of F, and let
$\beta$ be a root of $\sigma p$ in L. Then there exists an embedding $\tau: F(\alpha) \to L$
which is an extension of $\sigma$, and such that $\tau\alpha = \beta$. Conversely, every
extension $\tau$ of $\sigma$ to $F(\alpha)$ is such that $\tau\alpha$ is a root of $\sigma p$.

Proof. The second assertion follows from Property 1. To prove the
existence of $\tau$, let f be any polynomial in F[t], and define $\tau$ on the
element $f(\alpha)$ to be $(\sigma f)(\beta)$. The same element $f(\alpha)$ has many
representations as values $g(\alpha)$, for many polynomials g in F[t]. Thus we must show
that our definition of $\tau$ does not depend on the choice of f. Suppose
that $f, g \in F[t]$ are such that $f(\alpha) = g(\alpha)$. Then $(f - g)(\alpha) = 0$. Hence
there exists a polynomial h in F[t] such that f − g = ph. Then

$\sigma f = \sigma g + (\sigma p)(\sigma h)$.

Hence

$(\sigma f)(\beta) = (\sigma g)(\beta) + (\sigma p)(\beta) \cdot (\sigma h)(\beta) = (\sigma g)(\beta)$.

This proves that our map is well defined. We used the fact that p is
irreducible in an essential way! It is now a triviality to verify that $\tau$ is
an embedding, and we leave it to the reader.

Special case. Suppose that $\sigma$ is the identity on F, and let $\alpha, \beta$ be two
roots of an irreducible polynomial in F[t]. Then there exists an
isomorphism

$\tau: F(\alpha) \to F(\beta)$

which is the identity on F and which maps $\alpha$ on $\beta$.

There is another way of describing the isomorphism in Theorem 2.1,
or the special case.

Let p(t) be irreducible in F[t]. Let $\alpha$ be a root of p in some field.
Then (p) is the kernel of the homomorphism

$F[t] \to F(\alpha)$

which is the identity on F and maps t on $\alpha$. Indeed, $p(\alpha) = 0$ so p(t) is in
the kernel. Conversely, if $f(t) \in F[t]$ and $f(\alpha) = 0$ then $p \mid f$, otherwise the
greatest common divisor of p, f is 1 since p is irreducible, and 1 does not
vanish on $\alpha$, so this is impossible. Hence we get an isomorphism

$\sigma: F[t]/(p) \to F(\alpha)$.

Similarly, we have an isomorphism $\tau: F[t]/(p) \to F(\beta)$. The isomorphism

$\tau\sigma^{-1}: F(\alpha) \to F(\beta)$

maps $\alpha$ on $\beta$ and is the identity on F. We can represent $\tau\sigma^{-1}$ as in the
following diagram.

$F(\alpha) \xleftarrow{\ \sigma\ } F[t]/(p) \xrightarrow{\ \tau\ } F(\beta)$.

So far, we have been given a root of the irreducible polynomial in
some field. The above presentation suggests how to prove the existence
of such a root.

Theorem 2.2. Let F be a field and p(t) an irreducible polynomial of
degree ≥ 1 in F[t]. Then there exists an extension field E of F in which
p has a root.

Proof. The ideal (p) in F[t] is maximal, and the residue class ring

$F[t]/(p)$

is a field. Indeed, if $f(t) \in F[t]$ and $p \nmid f$, then (f, p) = 1 so there exist
polynomials $g(t), h(t) \in F[t]$ such that

$gf + hp = 1$.

This means that $gf \equiv 1 \bmod p$, whence f mod p is invertible in F[t]/(p).
Hence every non-zero element of F[t]/(p) is invertible, so F[t]/(p) is a
field. This field contains the image of F under the homomorphism which
to each polynomial assigns its residue class mod p(t). Since a field has
no ideals other than (0) and itself, and since $1 \not\equiv 0 \bmod p(t)$ because
deg p ≥ 1, we conclude that the natural homomorphism

$\sigma: F \to F[t]/(p(t))$

is an injection. Thus F[t]/(p(t)) is a finite extension of $\sigma F$. If we then
identify F with $\sigma F$, we may view F[t]/(p(t)) as a finite extension of F
itself, and the polynomial p has a root $\alpha$ in this extension, which is equal
to the residue class of t mod p(t). Thus we have constructed an extension
E of F in which p has a root.
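
The construction can be carried out by hand in a small case. The following sketch assumes F = Z/3Z and $p(t) = t^2 + 1$, which has no root mod 3 and is therefore irreducible of degree 2 over F. A residue class a + bt mod p is stored as the pair (a, b); the names are hypothetical.

```python
P = 3  # F = Z/3Z; t^2 + 1 has no root mod 3, so it is irreducible over F

def add(x, y):
    """Add two residue classes a + b*alpha and c + d*alpha."""
    return ((x[0] + y[0]) % P, (x[1] + y[1]) % P)

def mul(x, y):
    """Multiply a + b*alpha and c + d*alpha, using alpha^2 = -1."""
    a, b = x
    c, d = y
    return ((a * c - b * d) % P, (a * d + b * c) % P)

ZERO, ONE = (0, 0), (1, 0)
ALPHA = (0, 1)  # the residue class of t mod p(t)

elements = [(a, b) for a in range(P) for b in range(P)]
nonzero = [x for x in elements if x != ZERO]

def inverse(x):
    """Find the inverse by exhaustive search (there are only 9 elements)."""
    return next(y for y in nonzero if mul(x, y) == ONE)

# p(alpha) = alpha^2 + 1 should be the zero class.
value_of_p_at_alpha = add(mul(ALPHA, ALPHA), ONE)
```

Here `value_of_p_at_alpha` is the zero class, so the residue class of t is a root of p, and `inverse` succeeds on each of the 8 non-zero classes, so the residue class ring is a field with 9 elements.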

In the next section, we shall show how to get a field in which “all” the 
roots of a polynomial are contained, in a suitable sense. 

Theorem 2.3 (Extension Theorem). Let E be a finite extension of F.
Let $\sigma: F \to A$ be an embedding of F into an algebraically closed field A.
Then there exists an extension of $\sigma$ to an embedding $\bar{\sigma}: E \to A$.

Proof. Let $E = F(\alpha_1, \ldots, \alpha_n)$. By Theorem 2.1 there exists an extension
$\sigma_1$ of $\sigma$ to an embedding $\sigma_1: F(\alpha_1) \to A$. By induction on the number of
generators, there exists an extension of $\sigma_1$ to an embedding $\bar{\sigma}: E \to A$, as
was to be shown.

Next, we look more closely at the embeddings of a field $F(\alpha)$ over F.

Let $\alpha$ be algebraic over F. Let p(t) be the irreducible polynomial of $\alpha$
over F. Let $\alpha_1, \ldots, \alpha_n$ be the roots of p. Then we call these roots the
conjugates of $\alpha$ over F. By Theorem 2.1, for each $\alpha_i$, there exists an
embedding of $F(\alpha)$ which maps $\alpha$ on $\alpha_i$, and which is the identity on F.
This embedding is uniquely determined.

Example 1. Consider a root $\alpha$ of the polynomial $t^3 - 2$. We take $\alpha$ to
be the real cube root of 2, written $\alpha = \sqrt[3]{2}$. Let $1, \zeta, \zeta^2$ be the three cube
roots of unity. The polynomial $t^3 - 2$ is irreducible over Q, because it
has no root in Q (cf. Exercise 2 of Chapter IV, §3). Hence there exist
three embeddings of $\mathbf{Q}(\alpha)$ into C, namely the three embeddings $\sigma_1, \sigma_2, \sigma_3$
such that

$\sigma_1\alpha = \alpha$, $\sigma_2\alpha = \zeta\alpha$, $\sigma_3\alpha = \zeta^2\alpha$.

Example 2. If $\alpha = 1 + \sqrt{2}$, there exist two embeddings of $\mathbf{Q}(\alpha)$ into C,
namely those sending $\alpha$ to $1 + \sqrt{2}$ and $1 - \sqrt{2}$ respectively.
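
The second embedding in Example 2 can be made concrete. The sketch below models an element $a + b\sqrt{2}$ of $\mathbf{Q}(\sqrt{2})$ by the pair of rationals (a, b); the class name and method names are hypothetical. It checks that conjugation respects products on sample elements and sends a root of $t^2 - 2t - 1$ to a root, in line with Property 1.

```python
from fractions import Fraction

class QSqrt2:
    """Element a + b*sqrt(2) of Q(sqrt 2), with a and b rational."""

    def __init__(self, a, b=0):
        self.a, self.b = Fraction(a), Fraction(b)

    def __add__(self, other):
        return QSqrt2(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # (a + b r)(c + d r) = (ac + 2bd) + (ad + bc) r, where r^2 = 2.
        return QSqrt2(self.a * other.a + 2 * self.b * other.b,
                      self.a * other.b + self.b * other.a)

    def __eq__(self, other):
        return (self.a, self.b) == (other.a, other.b)

    def conj(self):
        """The embedding sending sqrt(2) to -sqrt(2), identity on Q."""
        return QSqrt2(self.a, -self.b)

def p(x):
    """Evaluate t^2 - 2t - 1, the polynomial with roots 1 + sqrt(2), 1 - sqrt(2)."""
    return x * x + QSqrt2(-2) * x + QSqrt2(-1)

alpha = QSqrt2(1, 1)                        # 1 + sqrt(2)
x, y = QSqrt2(3, -2), QSqrt2(Fraction(1, 2), 5)   # arbitrary sample elements
```

With these definitions, `p(alpha)` and `p(alpha.conj())` are both zero, and `(x * y).conj()` equals `x.conj() * y.conj()`.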

Example 3. Let F be a field of characteristic p, and let a ∈ F. Consider 
the polynomial t^p - a. Suppose α is a root of this polynomial in some 
extension field E. Then α is the only root; there is no other root, because 

    (t - α)^p = t^p - α^p = t^p - a, 

and we apply the unique factorization in the polynomial ring E[t]. 
Hence the only isomorphism of F(α) over F in some field containing F(α) 
must map α on α, and hence is the identity on F(α). 
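
The identity (t - α)^p = t^p - α^p used in Example 3 rests on the fact that the binomial coefficients C(p, k) with 0 < k < p are divisible by p. The following Python sketch illustrates this over the prime field Z/pZ. The function names are ours, and Z/pZ is chosen only because it is easy to compute in; in Z/pZ every element is a p-th power, so this checks the identity, not the irreducibility of t^p - a.

```python
from math import comb


def middle_binomials_mod_p(p):
    """Return the binomial coefficients C(p, k) mod p for 0 < k < p."""
    return [comb(p, k) % p for k in range(1, p)]


def expand_power_mod_p(alpha, p):
    """Coefficients (constant term first) of (t - alpha)^p over Z/pZ."""
    coeffs = [1]
    for _ in range(p):
        nxt = [0] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            nxt[i] = (nxt[i] - alpha * c) % p   # contribution of -alpha * c * t^i
            nxt[i + 1] = (nxt[i + 1] + c) % p   # contribution of c * t^(i+1)
        coeffs = nxt
    return coeffs


print(middle_binomials_mod_p(7))
print(expand_power_mod_p(3, 5))
```

For p = 5 and α = 3 the expansion has only a constant term and a leading term, in agreement with t^5 - 3^5 reduced mod 5.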

Because of the above phenomenon, the rest of this section will be 
under assumptions which preclude the phenomenon. 

For the rest of this section, we let A denote an algebraically closed field 
of characteristic 0. 

The reason for this assumption lies in the following corollary, which 
makes use of the key Theorem 1.6. 

Corollary 2.4. Let E be a finite extension of F. Let n be the degree of 
E over F. Let σ: F → A be an embedding of F into A. Then the number 
of extensions of σ to an embedding of E into A is equal to n. 

Proof. Suppose first E = F(α) and f(t) ∈ F[t] is the irreducible poly- 
nomial of α over F, of degree n. Then σf has n distinct roots in A by 
Theorem 1.6, so our assertion follows from Theorem 2.1. In general, we 
can write E in the form E = F(α_1, ..., α_r). Consider the tower 

    F ⊂ F(α_1) ⊂ F(α_1, α_2) ⊂ ... ⊂ F(α_1, ..., α_r). 

Let E_{r-1} = F(α_1, ..., α_{r-1}). Suppose that we have proved by induction 
that the number of extensions of σ to E_{r-1} is equal to the degree 
[E_{r-1} : F]. Let σ_1, ..., σ_m be the extensions of σ to E_{r-1}. Let d be the 
degree of α_r over E_{r-1}. For each i = 1, ..., m we can find precisely d 
extensions of σ_i to E, say σ_{i1}, ..., σ_{id}. Then it is clear that the set 
{σ_{ij}} (i = 1, ..., m and j = 1, ..., d) is the set of distinct extensions 
of σ to E, and it has md = [E_{r-1} : F][E : E_{r-1}] = [E : F] elements. 

This proves our corollary. 


Theorem 2.5 (Primitive element theorem). Let F be a field of 
characteristic 0. Let E be a finite extension of F. Then there exists an 
element γ of E such that E = F(γ). 


Proof. It will suffice to prove that if E = F(α, β) with two elements 
α, β algebraic over F, then we can find γ in E such that E = F(γ), for 
we can then proceed inductively. Let [E : F] = n. Let σ_1, ..., σ_n be the n 
distinct embeddings of E into A extending the identity map on F. We 
shall first prove that we can find an element c ∈ F such that the elements 

    σ_i α + c σ_i β = σ_i(α + cβ) 

are distinct, for i = 1, ..., n. We consider the polynomial 

    ∏_{i=1}^{n} ∏_{j≠i} [σ_j α - σ_i α + t(σ_j β - σ_i β)]. 

It is not the zero polynomial, since each factor is different from 0. This 
polynomial has a finite number of roots. Hence we can certainly find an 
element c of F such that when we substitute c for t we don't get the 
value 0. This element c does what we want. 

Now we assert that E = F(γ), where γ = α + cβ. In fact, by construc- 
tion, we have n distinct embeddings of F(γ) into A extending the identity 
on F, namely σ_1, ..., σ_n restricted to F(γ). Hence [F(γ) : F] ≥ n by 
Corollary 2.4. Since F(γ) is a subspace of E over F, and has the same 
dimension as E, it follows that F(γ) = E, and our theorem is proved. 

Example 4. Prove as an exercise that if α^3 = 2 and β is a square root 
of 2, then Q(α, β) = Q(γ), where γ = α + β. 
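
The claim of Example 4 can be checked by exact computation, although such a check is not a substitute for the proof asked for. The Python sketch below is ours, not the book's. It assumes that the six products α^i β^j (0 ≤ i < 3, 0 ≤ j < 2) form a basis of Q(α, β) over Q, which holds because the degrees 3 and 2 are relatively prime. It expresses the powers 1, γ, ..., γ^6 of γ = α + β on this basis, verifies that the first six are linearly independent over Q, and verifies one polynomial relation of degree 6 satisfied by γ.

```python
from fractions import Fraction

# An element of Q(alpha, beta), with alpha^3 = 2 and beta^2 = 2, is stored as
# a dict {(i, j): coefficient} on the basis alpha^i * beta^j, i < 3, j < 2.


def mul(x, y):
    """Multiply two elements, reducing with alpha^3 = 2 and beta^2 = 2."""
    out = {}
    for (i1, j1), c1 in x.items():
        for (i2, j2), c2 in y.items():
            i, j, c = i1 + i2, j1 + j2, c1 * c2
            if i >= 3:
                i, c = i - 3, c * 2
            if j >= 2:
                j, c = j - 2, c * 2
            out[(i, j)] = out.get((i, j), Fraction(0)) + c
    return out


def to_vector(x):
    """Coordinates of x on the basis alpha^i * beta^j."""
    return [x.get((i, j), Fraction(0)) for i in range(3) for j in range(2)]


def rank(rows):
    """Rank of a matrix of Fractions by Gauss-Jordan elimination."""
    rows = [r[:] for r in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(len(rows)):
            if r != rk and rows[r][col] != 0:
                factor = rows[r][col] / rows[rk][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rk


one = {(0, 0): Fraction(1)}
gamma = {(1, 0): Fraction(1), (0, 1): Fraction(1)}   # gamma = alpha + beta

powers = [one]
for _ in range(6):
    powers.append(mul(powers[-1], gamma))
vectors = [to_vector(p) for p in powers]

# 1, gamma, ..., gamma^5 are independent exactly when this rank is 6.
degree_rank = rank(vectors[:6])

# Candidate relation t^6 - 6t^4 - 4t^3 + 12t^2 - 24t - 4, constant term first.
coeffs = [-4, -24, 12, -4, -6, 0, 1]
relation = [sum(c * v for c, v in zip(coeffs, column)) for column in zip(*vectors)]

print(degree_rank, relation)
```

A rank of 6 means that [Q(γ) : Q] = 6 = [Q(α, β) : Q], which is the content of the example under the stated assumption on the basis.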

Remark. In arbitrary characteristic, Corollary 2.4 and Theorem 
2.5 are not necessarily true. In order to obtain an analogous theory 
resulting from the properties of extensions expressed in those corollaries, 
it is necessary to place a restriction on the types of finite extensions that 
are considered. Namely, we define a polynomial f(t) ∈ F[t] to be 
separable if the number of distinct roots of f is equal to the degree of f. 
Thus if f has degree d, then f has d distinct roots. An element α of E is 
defined to be separable over F if its irreducible polynomial over F is 
separable, or equivalently if α is the root of a separable polynomial in 
F[t]. We define a finite extension E of F to be separable if it satisfies the 
property forming the conclusion of Corollary 2.4, that is: the number of 
possible extensions of an embedding σ: F → A to an embedding of E into 
A is equal to the degree [E : F]. You can now prove: 

Let the characteristic be arbitrary. Let E be a finite extension of F. 

(a) If E is separable over F and α ∈ E, then the irreducible polynomial of 
α over F is separable, and so every element of E is separable over F. 

(b) If E = F(α_1, ..., α_r) and if each α_i is separable over F, then E 
is separable over F. 

(c) If E is separable over F and F' is any extension of F such that E, F' 
are subfields of some larger field, then EF' is separable over F'. 

(d) If F ⊂ E_1 ⊂ E_2 is a tower of extensions such that E_1 is separable 
over F and E_2 is separable over E_1, then E_2 is separable over F. 

(e) If E is separable over F and E_1 is a subfield of E containing F, then 
E_1 is separable over F. 

The proofs are easy, and consist merely of further applications of the 
techniques we have already seen. However, in first reading, it may be 
psychologically preferable for you just to assume characteristic 0, to get 
into the main theorems of Galois theory right away. The technical 
complication arising from lack of separability can then be absorbed 
afterwards. Therefore we shall omit the proof of the above statements, 
which you can look up in a more advanced text, e.g. my Algebra if you 
are so inclined. 

Note that the substitute for Theorem 2.5 in general can be formulated 
as follows. 

Let E be a finite separable extension of F. Then there exists an element 
γ of E such that E = F(γ). 

The proof is the same as that of Theorem 2.5. Only separability was 
needed in the proof, except for finite fields, in which case we use Theorem 
1.10 of Chapter IV. 

The same principle will apply later in the Galois theory. We shall 
define the splitting field of a polynomial. In arbitrary characteristic, we 
define a Galois extension to be the splitting field of a separable 
polynomial. Then the statements and proofs of the Galois theory go 
through unchanged. Thus the hypothesis of separability replaces the too 
restrictive hypothesis of characteristic 0 throughout. 


VII, §2. EXERCISES 

All fields in these exercises are assumed to be of characteristic 0. 

1. In each case, find an element γ such that Q(α, β) = Q(γ). Always prove all 
assertions which you make. 

(a) a = 7^5, p = (b) a=^/2, 0 = ^2 

(c) α = root of t^3 - t + 1, β = root of t^2 - t - 1 

(d) α = root of t^3 - 2t + 3, β = root of t^2 + t + 2 

Determine the degrees of the fields Q(α, β) over Q in each one of the cases. 

2. Suppose β is algebraic over F but not in F, and lies in some algebraic closure 
A of F. Show that there exists an embedding of F(β) over F which is not the 
identity on β. 

3. Let E be a finite extension of F, of degree n. Let σ_1, ..., σ_n be all the 
distinct embeddings of E over F into A. For α ∈ E, define the trace and 
norm of α respectively (from E to F), by 

    Tr_F^E(α) = ∑_{i=1}^{n} σ_i α = σ_1 α + ... + σ_n α, 

    N_F^E(α) = ∏_{i=1}^{n} σ_i α = σ_1 α ... σ_n α. 

(a) Show that the norm and trace of α lie in F. 

(b) Show that the trace is an additive homomorphism, and that the norm is a 
multiplicative homomorphism. 

4. Let α be algebraic over the field F, and let 

    p(t) = t^n + a_{n-1} t^{n-1} + ... + a_0 

be the irreducible polynomial of α over F. Show that 

    N(α) = (-1)^n a_0 and Tr(α) = -a_{n-1}. 

(The norm and trace are taken from F(α) to F.) 

5. Let E be a finite extension of F, and let a be an element of F. Let 

    [E : F] = n. 

What are the norm and trace of a from E to F? 

6. Let α be algebraic over the field F. Let m_α: F(α) → F(α) be multiplication 
by α, which is an F-linear map. We assume you know about determinants of 
linear maps. Let D(α) be the determinant of m_α, and let T(α) be the trace of 
m_α. (The trace of a linear map is the sum of the diagonal elements in a matrix 
representing the linear map with respect to a basis.) Show that 

    D(α) = N(α) and T(α) = Tr(α), 

where N and Tr are the norm and trace of Exercise 3. 

VII, §3. SPLITTING FIELDS 

In this section we do not assume characteristic 0. 

Let E be a finite extension of F. Let σ be an embedding of F, and τ an 
extension of σ to an embedding of E. We shall also say that τ is over σ. 
If σ is the identity map, then we say that τ is an embedding of E over F. 
Thus τ is an embedding of E over F means that τx = x for all x ∈ F. 
We also say that τ leaves F fixed. 

If τ is an embedding of E over F, then τ is called a conjugate mapping 
and the image field τE is called a conjugate of E over F. Observe that if 
α ∈ E, then τα is a conjugate of α over F. Thus we apply the word 
"conjugate" both to elements and to fields. By Corollary 2.4, if F has 
characteristic 0, then the number of conjugate mappings of E over F is 
equal to the degree [E : F]. 

Let f(t) ∈ F[t] be a polynomial of degree n ≥ 1. By a splitting field K 
for f we mean a finite extension of F such that f has a factorization in K 
into factors of degree 1, that is 

    f(t) = c(t - α_1) ... (t - α_n), 

with c ∈ F the leading coefficient of f, and K = F(α_1, ..., α_n). Thus roughly 
we may say that a splitting field is the field generated by “all” the roots 
of f such that f factors into factors of degree 1. 

A priori, we could have a factorization as above in some field K, and 
another factorization 

    f(t) = c(t - β_1) ... (t - β_n) 

with β_i in some field L = F(β_1, ..., β_n). The question arises whether there 
is any relation between K and L, and the answer is that these splitting 
fields must be isomorphic. The next theorems prove both the existence of 
a splitting field and its uniqueness up to isomorphism. 

Theorem 3.1. Let f(t) ∈ F[t] be a polynomial of degree ≥ 1. Then there 
exists a splitting field for f. 




Proof. By induction on the degree of f. Let p be an irreducible factor 
of f. By Theorem 2.2, there exists an extension E_1 = F(α_1) with some 
root α_1 of p, and hence of f. Let 

    f(t) = (t - α_1)g(t) 

be a factorization of f in E_1[t]. Then deg g = deg f - 1. Hence by 
induction, there exists a field E = E_1(α_2, ..., α_n) such that 

    g(t) = c(t - α_2) ... (t - α_n) 

with some element c ∈ F. This concludes the proof. 

The splitting field of a polynomial is uniquely determined up to 
isomorphism. More precisely: 

Theorem 3.2. Let f(t) ∈ F[t] be a polynomial of degree ≥ 1. Let K and 
L be extensions of F which are splitting fields of f. Then there exists an 
isomorphism 

    σ: K → L 

over F. 

Proof. Without loss of generality, we may assume that the leading 
coefficient of f is 1. We shall prove the following more precise statement 
by induction. 

Let σ: F → σF be an isomorphism. Let f(t) ∈ F[t] be a polynomial with 
leading coefficient 1, and let K = F(α_1, ..., α_n) be a splitting field for f 
with the factorization 

    f(t) = ∏_{i=1}^{n} (t - α_i). 

Let L be a splitting field of σf, with L = (σF)(β_1, ..., β_n) and 

    σf(t) = ∏_{i=1}^{n} (t - β_i). 

Then there exists an isomorphism τ: K → L extending σ such that, after a 
permutation of β_1, ..., β_n if necessary, we have τα_i = β_i for i = 1, ..., n. 

Let p(t) be an irreducible factor of f. By Theorem 2.1, given a root α_1 
of p and a root β_1 of σp, there exists an isomorphism 

    τ_1: F(α_1) → (σF)(β_1) 

extending σ and mapping α_1 on β_1. We can factor 

    f(t) = (t - α_1)g(t)        over F(α_1), 
    σf(t) = (t - β_1)τ_1 g(t)   over σF(β_1), 

because τ_1 α_1 = β_1. By induction, applied to the polynomials g and τ_1 g, 
we obtain the isomorphism τ, which concludes the proof. 

Remark. Although we have collected together theorems concerning 
finite fields in a subsequent chapter, the reader may wish to look at that 
chapter immediately to see how finite fields are determined up to 
isomorphism as splitting fields. These provide nice examples for the 
considerations of this section. 

By an automorphism of a field K we shall mean an isomorphism 
σ: K → K of K with itself. The context always makes it clear that we 
mean field-isomorphism (and not another kind of isomorphism, e.g. 
group, or vector space isomorphism). 

Let σ be an embedding of a finite extension K of F, over F. Assume 
that σ(K) is contained in K. Then σ(K) = K. 

Indeed, σ induces a linear map of the vector space K over F, and is 
injective. You should know from linear algebra that σ is then surjective. 
Indeed, dim_F(K) = dim_F(σK). If a subspace of a finite dimensional vector 
space has the same dimension as the vector space, then the subspace is 
equal to the whole space. Then it follows that σ is an isomorphism of 
fields, whence an automorphism of K. 

We observe that the set of all automorphisms of a field K is a group. 
Trivial verification. We shall be concerned with certain subgroups. 

Let G be a group of automorphisms of a field K. Let K^G be the set 
of all elements x ∈ K such that σx = x for all σ ∈ G. Then K^G is a field. 
Indeed, K^G contains 0 and 1. If x, y are in K^G, then 

    σ(x + y) = σx + σy = x + y, 
    σ(xy) = σ(x)σ(y) = xy, 

so x + y and xy are in K^G. Also σ(x^{-1}) = σ(x)^{-1} = x^{-1}, so x^{-1} is 
in K^G. This proves that K^G is a field, called the fixed field of G. 

If G is a group of automorphisms of K over a subfield F, then F is 
contained in the fixed field (by definition), but the fixed field may be 
bigger than F. For instance, G could consist of the identity alone, in which 
case its fixed field is K itself. 




Example 1. The field of rational numbers has no automorphisms 
except the identity. Proof? 

Example 2. Prove that the field Q(α) where α^3 = 2 has no 
automorphism except the identity. 

Example 3. Let F be a field of characteristic ≠ 2, and a ∈ F. Assume 
that a is not a square in F, and let α^2 = a. Then F(α) has precisely two 
automorphisms over F, namely the identity, and the automorphism 
which maps α on -α. 

Remark. The isomorphism of Theorem 3.2 between the splitting fields 
K and L is not uniquely determined. If σ, τ are two isomorphisms of K 
onto L leaving F fixed, then 

    στ^{-1}: L → L 

is an automorphism of L over F. We shall study the group of such 
automorphisms later in this chapter. We emphasize the need for the 
possible permutation of β_1, ..., β_n in the statement of the result. 
Furthermore, it will be a problem to determine which permutations of these 
roots actually may occur. 

A finite extension K of F in an algebraically closed field A will be said 
to be a normal extension if every embedding of K over F in A is an 
automorphism of K. 

Theorem 3.3. A finite extension of F is normal if and only if it is the 
splitting field of a polynomial. 

Proof. Let K be a normal extension of F. Suppose K = F(α) for 
some element α. Let p(t) be the irreducible polynomial of α over F. For 
each root α_i of p, there exists a unique embedding σ_i of K over F such 
that σ_i α = α_i. Since each embedding is an automorphism, it follows that 
α_i is contained in K. Hence 

    K = F(α) = F(α_1, ..., α_n), 

and K is the splitting field of p. 

If F has characteristic 0, then we are done by Theorem 2.5, because 
we know K = F(α) for some α. In general, K = F(α_1, ..., α_r), and we can 
argue in the same way with each irreducible polynomial p_i(t) with 
respect to α_i. We then let f(t) be the product 

    f(t) = p_1(t) ... p_r(t). 

Assuming that K is normal over F, it follows that K is the splitting field 
of the polynomial f(t), which in this case is not irreducible, but so what? 

Conversely, suppose that K is the splitting field of a polynomial f(t), 
not necessarily irreducible, with roots α_1, ..., α_n. If σ is an embedding of 
K over F, then σα_i must also be a root of f. Hence σ maps K into itself, 
and hence σ is an automorphism. 

Theorem 3.4. Let K be a normal extension of F. If p(t) is a polynomial 
in F[t], and is irreducible over F, and if p has one root in K, then p 
has all its roots in K. 

Proof. Let α be one root of p in K. Let β be another root. By 
Theorem 2.1 there exists an embedding σ of F(α) on F(β) mapping α on 
β, and equal to the identity on F. Extend this embedding to K. Since 
an embedding of K over F is an automorphism, we must have σα ∈ K, 
and hence β ∈ K. 


VII, §3. EXERCISES 

1. Let F be a field of characteristic p. Let c ∈ F and let 

    f(t) = t^p - t - c. 

(a) Show that either all roots of f lie in F or f is irreducible in F[t]. [Hint: 
Let α be a root, and a ∈ Z/pZ. What can you say about α + a? For the 
irreducibility, suppose there is a factor g such that 1 < deg g < deg f. Look 
at the coefficient of the term of next to highest degree. Such a coefficient 
must lie in F.] 

(b) Let F = Z/pZ. Let c ∈ F and c ≠ 0. Show that t^p - t - c is irreducible in 
F[t]. 

(c) Let F again be any field of characteristic p. Let α be a root of f. Show 
that F(α) is a splitting field for f. 

(d) Assume f irreducible. Prove that there is a group of automorphisms of 
F(α) over F isomorphic to Z/pZ (and so cyclic of order p). 

(e) Prove that there are no other automorphisms of F(α) over F besides those 
found in (d). 

2. Let F be a field and let n be a positive integer not divisible by the 
characteristic of F if this characteristic is ≠ 0. Let a ∈ F. Let ζ be a primitive 
n-th root of unity, and α one root of t^n - a. Prove that the splitting field of 
t^n - a is F(α, ζ). 

3. (a) Let p be an odd prime. Let F be a field of characteristic 0, and let a ∈ F, 
a ≠ 0. Assume that a is not a p-th power in F. Prove that t^p - a is 
irreducible in F[t]. [Hint: Suppose t^p - a factors over F. Look at the 
constant term of one of the factors, expressed as a product of some roots, 
and deduce that a is a p-th power in F.] 

(b) Again assume that a is not a p-th power in F. Prove that for every 
positive integer r, t^{p^r} - a is irreducible in F[t]. [Hint: Use induction. 
Distinguish the cases whether a root α of t^{p^r} - a is a p-th power in F(α) or 
not, and take the norm from F(α) to F. The norm was defined in the 
exercises of §2.] 

4. Let F be a field of characteristic 0, let a ∈ F, a ≠ 0. Let n be an odd integer ≥ 3. 
Assume that for all prime numbers p such that p | n we have a ∉ F^p (where F^p 
is the set of p-th powers in F). Show that t^n - a is irreducible in F[t]. [Hint: 
Write n = p^r m with p ∤ m. Assume inductively that t^m - a is irreducible in 
F[t]. Show that a root α of t^m - a is not a p-th power in F(α) and use 
induction.] 

Remark. When n is even, the analogous result is not quite true because of 
the factorization of t^4 + 4. Essentially this is the only exception, and the 
general result can be stated as follows. 

Theorem. Let F be a field and n an integer ≥ 2. Let a ∈ F, a ≠ 0. Assume 
that for all prime numbers p such that p | n we have a ∉ F^p, and if 4 | n then 
a ∉ -4F^4. Then t^n - a is irreducible in F[t]. 

It is more tedious to handle this general case, but you can always have a try 
at it. The main point is that the prime 2 causes some trouble. 

5. Let E be a finite extension of F. Let E = E_1, E_2, ..., E_r be all the distinct 
conjugates of E over F. Prove that the composite 

    K = E_1 E_2 ... E_r 

is the smallest normal extension of F containing E. We can say that this 
smallest normal extension is the composite of E and all its conjugates over F. 

6. Let A be an algebraic extension of a field F of characteristic 0. Assume that 
every polynomial of degree ≥ 1 in F[t] has at least one root in A. Prove that 
A is algebraically closed. [Hint: If not, there is some element α in an extension 
of A which is algebraic over A but not in A. Show that α is algebraic over F. 
Let f(t) be the irreducible polynomial of α over F, and let K be the splitting 
field of f in some algebraic closure of A. Write K = F(γ), and let g(t) be the 
irreducible polynomial of γ over F. Now use the assumption that g has a root 
in A. Remark: the result of this exercise is valid even if we don't assume 
characteristic 0, but one must then use additional arguments to deal with the 
possibility that the primitive element theorem cannot always be applied.] 

VII, §4. GALOIS THEORY 

Throughout this section we assume that all fields have characteristic 0. 

In the case of characteristic 0, a normal extension will also be called a 
Galois extension. 

Theorem 4.1. Let K be a Galois extension of F. Let G be the group of 
automorphisms of K over F. Then F is the fixed field of G. 

Proof. Let F' be the fixed field. Then trivially, F ⊂ F'. Suppose α ∈ F' 
and α ∉ F. Then by Theorem 2.1 there exists an embedding σ_0 of F(α) 
over F such that σ_0 α ≠ α. Extend σ_0 to an embedding σ of K over F by 
Theorem 2.3. By hypothesis, σ is an automorphism of K over F, and 
σα = σ_0 α ≠ α, thereby contradicting the assumption that α ∈ F' but α ∉ F. 
This proves our theorem. 

If K is a Galois extension of F, the group of automorphisms of K over 
F is called the Galois group of K over F and is denoted by G_{K/F}. If K is 
the splitting field of a polynomial f(t) in F[t], then we also say that G_{K/F} 
is the Galois group of f. 

Theorem 4.2. Let K be a Galois extension of F. To each intermediate 
field E, associate the subgroup G_{K/E} of automorphisms of K leaving E 
fixed. Then K is Galois over E. The map 

    E ↦ G_{K/E} 

is an injective and surjective map from the set of intermediate fields 
onto the set of subgroups of G, and E is the fixed field of G_{K/E}. 

Proof. Every embedding of K over E is an embedding over F, and 
hence is an automorphism of K. It follows that K is Galois over E. 
Furthermore, E is the fixed field of G_{K/E} by Theorem 4.1. This shows in 
particular that the map 

    E ↦ G_{K/E} 

is injective, i.e. if E ≠ E' then G_{K/E} ≠ G_{K/E'}. Finally, let H be a subgroup 
of G. Write K = F(α) with some element α. Let {σ_1, ..., σ_r} be the 
elements of H, and let 

    f(t) = (t - σ_1 α) ... (t - σ_r α). 

For any σ in H, we note that {σσ_1, ..., σσ_r} is a permutation of 
{σ_1, ..., σ_r}. Hence from the expression 

    σf(t) = (t - σσ_1 α) ... (t - σσ_r α) = f(t), 

we see that f has its coefficients in the fixed field E of H. Furthermore, 
K = E(α), and α is a root of a polynomial of degree r over E. Hence 
[K : E] ≤ r. But K has r distinct embeddings over E (those of H), and 
hence by Corollary 2.4, [K : E] = r, and H = G_{K/E}. This proves our 
theorem. 

Let f(t) ∈ F[t], and let 

    f(t) = (t - α_1) ... (t - α_n). 

Let K = F(α_1, ..., α_n), and let σ be an element of G_{K/F}. Then 
{σα_1, ..., σα_n} is a permutation of {α_1, ..., α_n}, which we may denote 
by π_σ. If σ ≠ τ, then π_σ ≠ π_τ, and clearly, 

    π_{στ} = π_σ ∘ π_τ. 

Hence we have represented the Galois group G_{K/F} as a group of 
permutations of the roots of f. More precisely, we have an injective 
homomorphism of the Galois group G_{K/F} into the symmetric group S_n: 

    G_{K/F} → S_n given by σ ↦ π_σ. 

Of course, it is not always true that every permutation of {α_1, ..., α_n} is 
represented by an element of G_{K/F}, even if f is irreducible over F. Cf. the 
next section for examples. In other words, given a permutation π of 
{α_1, ..., α_n} such a permutation is merely a bijective map of the set of 
roots with itself. It is not always true that there exists an automorphism 
σ of K over F whose restriction to this set of roots is equal to π. Or, put 
another way, the permutation π cannot necessarily be extended to an 
automorphism of K over F. We shall find later conditions when 
G_{K/F} ≈ S_n. 

The notion of a Galois extension and a Galois group are defined 
completely algebraically. Hence they behave formally under isomor- 
phisms the way one expects from general algebraic situations. We de- 
scribe this behavior explicitly. 

Let K be a Galois extension of F. Let 

    λ: K → λK 

be an isomorphism. Then λK is a Galois extension of λF. Indeed, if K 
is the splitting field of the polynomial f, then λK is the splitting field of 
λf. 

Let G be the Galois group of K over F. Then the map 

    σ ↦ λ ∘ σ ∘ λ^{-1} 

gives a homomorphism of G into the Galois group of λK over λF, whose 
inverse is given by 

    λ^{-1} ∘ τ ∘ λ ↤ τ. 

    K ----λ----> λK 
    |            | 
    F ----λ----> λF 

Hence G_{λK/λF} is isomorphic to G_{K/F} under the above map. We may write 

    G_{λK/λF} = λ G_{K/F} λ^{-1}. 

This is a “conjugation”, just like conjugations in the theory of groups in 
Chapter II. 

In particular, let E be an intermediate field, say 

    F ⊂ E ⊂ K. 

Let λ: E → λE be an embedding of E in K, which we assume extends to 
an automorphism of K. Then λK = K. Hence 

    G_{K/λE} = λ G_{K/E} λ^{-1}. 

Theorem 4.3. Let K be a Galois extension of a field F, and let E be an 
intermediate field, F ⊂ E ⊂ K. Let 

    H = G_{K/E} and G = G_{K/F}. 

Then E is a Galois extension of F if and only if H is normal in G. If 
this is the case, then the restriction map 

    σ ↦ res_E σ 

induces an isomorphism of G/H onto G_{E/F}. 

Proof. Assume that G_{K/E} is normal in G. Let λ_0 be an embedding of E 
over F. It suffices to prove that λ_0 is an automorphism of E. Let λ be 
an extension of λ_0 to K. Since K is Galois over F, it follows that λ is 
an automorphism of K over F, and by assumption 

    G_{K/E} = λ G_{K/E} λ^{-1} = G_{K/λE}. 

By Theorem 4.2 it follows that λE = E, so λ_0 is an automorphism of E, 
whence E is normal over F. 

Conversely suppose E normal over F. Then the restriction map 

    σ ↦ σ|E for σ ∈ G_{K/F} 

is a homomorphism G_{K/F} → G_{E/F} whose kernel is G_{K/E} by definition. 
Hence G_{K/E} is normal. This homomorphism is surjective, because given 
σ_0 ∈ G_{E/F} we can extend σ_0 to an embedding σ of K over F, and since K 
is Galois over F, we actually have σ ∈ G_{K/F} and σ_0 is the restriction of σ to E. 
This concludes the proof of Theorem 4.3. 

A Galois extension K/F is said to be abelian if its Galois group is 
abelian. It is said to be cyclic if its Galois group is cyclic. 

Corollary 4.4. Let K/F be an abelian extension. If E is an intermediate 
field, F ⊂ E ⊂ K, then E is Galois over F and abelian over F. 
Similarly, if K/F is a cyclic extension, then E is cyclic over F. 

Proof. This follows at once from the fact that a subgroup of an 
abelian group is normal and abelian, and a factor group of an abelian 
group is abelian. Similarly for cyclic subgroups and factor groups. 

The next theorem describes what happens to a Galois extension under 
translation. 

Theorem 4.5. Let K/F be a Galois extension with group G. Let E be an 
arbitrary extension field of F. Assume that K, E are both contained in 
some field, and let KE be the composite field. Then KE is Galois over 
E. The map 

    G_{KE/E} → G_{K/F} given by σ ↦ res_K(σ), 

i.e. the restriction of an element of G_{KE/E} to K, gives an isomorphism of 
G_{KE/E} with G_{K/(K∩E)}. 

The field diagram for Theorem 4.5 is as follows: 

          KE 
         /  \ 
        K    E 
         \  / 
        K ∩ E 
          | 
          F 


Proof. The extension KE/E is clearly Galois. The restriction maps 
G_{KE/E} a priori into the set of embeddings of K over F. But since we 
assume that K is Galois over F, the image of the restriction lies in G_{K/F}. 
The restriction is obviously a homomorphism 

    res: G_{KE/E} → G_{K/F}. 

The kernel of the restriction is the identity, for if σ ∈ G_{KE/E} is the identity 
on K then σ is the identity on KE, because KE is generated by elements 
of K and elements of E. Thus we get an injective homomorphism of 
G_{KE/E} into G_{K/F}. Since an element of G_{KE/E} is the identity on E, the 
restriction of this element to K is the identity on K ∩ E, so the image of 
the restriction lies in the Galois group of K/(K ∩ E). Thus we get an 
injective homomorphism 

    G_{KE/E} ↪ G_{K/(K∩E)}. 

We must finally prove that this map is surjective. It will suffice to prove 
that 

    [K : K ∩ E] = [KE : E], 

because if a subgroup of a finite group has the same order as the group, 
then the subgroup is equal to the group. So write K = F(α) for some 
generator α. Let 

    f(t) = ∏_{σ ∈ G_{KE/E}} (t - σα), 

where the product is taken over all elements σ ∈ G_{KE/E}. Then the 
coefficients of f(t) lie in E since they are fixed under every element of 
G_{KE/E} which permutes the roots of f(t). Furthermore, these coefficients lie 
in K, because each σα ∈ K by the hypothesis that K is normal over F. 
Hence the coefficients of f lie in K ∩ E. Hence 

    [K : K ∩ E] ≤ deg f = [KE : E] ≤ [K : K ∩ E], 

where this last inequality is by Proposition 1.5 to the effect that the 
degree does not increase under translation. This proves the theorem. 


Corollary 4.6. Let K/F be Galois and let E be an arbitrary extension of 
F. Then 

    [KE : E] divides [K : F]. 

Furthermore if K ∩ E = F then 

    [KE : E] = [K : F]. 

Proof. We have 

    [K : F] = [K : K ∩ E][K ∩ E : F] = [KE : E][K ∩ E : F]. 

Both assertions follow from this relation. 

In the example preceding Proposition 1.5, we have seen that the relations of 
Corollary 4.6 are not necessarily true when K/F is not Galois. 

As a special case of Corollary 4.6, suppose K/F Galois and 
K ∩ E = F. Let K = F(α). The relation [KE : E] = [K : F] tells us that 
the degree of α over F is equal to the degree of α over E. This is a 
statement about irreducibility, namely the irreducible polynomial of α 
over F is the same as the irreducible polynomial of α over E. The 
example preceding Proposition 1.5 shows that if we do not assume F(α) 
normal over F, even if F(α) ∩ E = F, it does not necessarily follow that 

    [F(α) : F] = [E(α) : E]. 

Corollary 4.7. Let K/F be a Galois extension and let E be a finite 
extension. If K ∩ E = F then 

    [KE : K] = [E : F]. 

Proof. Exercise 6. 

We shall now prove an interesting theorem due to Artin, showing how 
to get a Galois extension going from the top down rather than from the 
bottom up as we have done up to now. The technique of proof is 
interesting in that it is similar to the technique used to prove Theorem 
4.5, and also uses aspects from the primitive element theorem, so you see 
field theory at work. We start with a lemma. 

Lemma 4.8. Let E be an extension field of F such that every element of 
E is algebraic over F. Assume that there is an integer n ≥ 1 such that 
every element of E is of degree ≤ n over F. Then E is a finite extension 
and [E : F] ≤ n. 

Proof. Let α be an element of E such that the degree [F(α) : F] is 
maximal, say m ≤ n. We claim that F(α) = E. If this is not true, then 
there exists an element β ∈ E such that β ∉ F(α), and by Theorem 2.5, 
there exists an element γ ∈ F(α, β) such that F(α, β) = F(γ). But from the 
tower 

    F ⊂ F(α) ⊂ F(α, β) 

we see that [F(α, β) : F] > m, whence γ has degree > m over F, a 
contradiction which proves the lemma. 

Theorem 4.9 (Artin). Let K be a field and let G be a finite group of 
automorphisms of K, of order n. Let F = K^G be the fixed field. Then K 
is a Galois extension of F, and its Galois group is G. We have 
[K : F] = n. 

Proof. Let α ∈ K and let σ_1, ..., σ_r be a maximal set of elements of G 
such that σ_1 α, ..., σ_r α are distinct. If τ ∈ G then (τσ_1 α, ..., τσ_r α) differs 
from (σ_1 α, ..., σ_r α) by a permutation, because τ is injective, and every 
τσ_i α is among the set {σ_1 α, ..., σ_r α}; otherwise this set is not maximal. 
Hence α is a root of the polynomial 

    f(t) = ∏_{i=1}^{r} (t - σ_i α), 

and for any τ ∈ G, τf = f. Hence the coefficients of f lie in K^G = F. 
Hence every element α of K is a root of a polynomial of degree ≤ n with 
coefficients in F. Furthermore, this polynomial splits in linear factors in 
K. By Lemma 4.8 and Theorem 2.5 we can write K = F(γ), and so K is 
Galois over F. The Galois group of K over F has order [K : F] ≤ n 
by Corollary 2.4. Since G is a group of automorphisms of K over F, 
it follows that G is equal to the Galois group, thus proving the theorem. 
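
The key construction in the proof of Theorem 4.9 is the polynomial whose roots are the distinct images of α under G, and whose coefficients land in the fixed field. The Python sketch below illustrates this in a small case chosen by us and not taken from the text: K = Q(√2), with G consisting of the identity and the conjugation a + b√2 ↦ a - b√2, so that the fixed field is Q. An element a + b√2 is stored as the pair (a, b); all function names are hypothetical.

```python
from fractions import Fraction

# Elements of Q(sqrt 2) are pairs (a, b) standing for a + b*sqrt(2).


def add(x, y):
    return (x[0] + y[0], x[1] + y[1])


def mul(x, y):
    return (x[0] * y[0] + 2 * x[1] * y[1], x[0] * y[1] + x[1] * y[0])


def neg(x):
    return (-x[0], -x[1])


def conj(x):
    return (x[0], -x[1])


# The group G = {identity, conjugation} acting on Q(sqrt 2).
GROUP = [lambda x: x, conj]


def orbit_polynomial(alpha):
    """Coefficients (constant term first) of the product of (t - s(alpha)),
    taken over the distinct images s(alpha) for s in GROUP."""
    orbit = []
    for sigma in GROUP:
        image = sigma(alpha)
        if image not in orbit:
            orbit.append(image)
    poly = [(Fraction(1), Fraction(0))]          # the constant polynomial 1
    for root in orbit:
        nxt = [(Fraction(0), Fraction(0))] * (len(poly) + 1)
        for i, c in enumerate(poly):
            nxt[i] = add(nxt[i], mul(neg(root), c))   # -root * c * t^i
            nxt[i + 1] = add(nxt[i + 1], c)           # c * t^(i+1)
        poly = nxt
    return poly


f = orbit_polynomial((Fraction(1), Fraction(3)))     # alpha = 1 + 3*sqrt(2)
print(f)
```

For α = 1 + 3√2 the result is t^2 - 2t - 17: every coefficient has zero √2-component, so it lies in the fixed field Q, as the proof predicts. For a rational α the orbit has one element and the polynomial has degree 1.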

Remark. Let A be an algebraically closed field, and let G be a non- 
trivial finite group of automorphisms of A. It is a theorem of Artin that 
G has order 2, and that essentially, the fixed field is something like 
the real numbers. For a precise statement, see, for instance, my book 
Algebra.

For those who have read the section on Sylow groups, we shall now 
give an application of Galois theory. 

Theorem 4.10. The complex numbers are algebraically closed.

Proof. The only facts about the real numbers used in the proof are: 

1. Every polynomial of odd degree with real coefficients has a real 
root. 

2. Every positive real number is the square of a real number. 

We first remark that every complex number has a square root in the 
complex numbers. If you are willing to use the polar form, then the 
formula for the square root is easy. Indeed, if $z = re^{i\theta}$ then

$$r^{1/2} e^{i\theta/2}$$

is a square root of z. You can also write down a formula for the square 
root of $x + iy$ ($x, y \in \mathbf{R}$) directly in terms of x and y, using only the fact
that a positive real number is the square of a real number. Do this as an 
exercise. 
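
One possible answer to this exercise can be sketched as follows. With
$r = \sqrt{x^2 + y^2}$, the numbers $u = \sqrt{(r + x)/2}$ and $v = \pm\sqrt{(r - x)/2}$, with
the sign of v chosen equal to the sign of y, satisfy $(u + iv)^2 = x + iy$.
Only square roots of non-negative reals are taken. The function name
`complex_sqrt` is hypothetical, and the code is a floating point sketch
rather than a proof.

```python
from math import sqrt


def complex_sqrt(x, y):
    """Return (u, v) with (u + iv)^2 = x + iy, using only real square roots."""
    r = sqrt(x * x + y * y)          # r >= |x|, so both radicands below are >= 0
    u = sqrt(max(0.0, (r + x) / 2))  # max() guards against rounding below zero
    v = sqrt(max(0.0, (r - x) / 2))
    if y < 0:
        v = -v                       # 2uv must have the same sign as y
    return u, v


u, v = complex_sqrt(3, 4)
print(u, v)
```

Indeed $u^2 - v^2 = x$ and $2uv = \pm\sqrt{r^2 - x^2} = y$ by the choice of sign.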

Now let E be a finite extension of R containing C. Let K be a Galois
extension of R containing E. Let $G = G_{K/\mathbf{R}}$, and let H be a 2-Sylow
subgroup of G. Let F be the fixed field of H. Since $[F : \mathbf{R}] = (G : H)$ it follows
that $[F : \mathbf{R}]$ is odd. We can write $F = \mathbf{R}(\alpha)$ for some element $\alpha$ by
Theorem 2.5. Let $f(t)$ be the irreducible polynomial of $\alpha$ over R. Then
deg f is odd, whence f has a root in R, whence f has degree 1, whence
$F = \mathbf{R}$. Hence G is a 2-group. Let $G_0 = G_{K/\mathbf{C}}$. If $G_0 \neq \{1\}$, then being a
2-group, $G_0$ has a subgroup $H_0$ of index 2 by Theorem 9.1 of Chapter II.
The fixed field of $H_0$ is then an extension of C of degree 2. But every
such extension is generated by the root of a polynomial $t^2 - \beta$, for some
$\beta \in \mathbf{C}$. This contradicts the fact that every complex number is the square
of a complex number, and concludes the proof of the theorem. 

Remark. The above proof is a variation by Artin of a proof by Gauss. 


VII, §4. EXERCISES 

1. By a primitive n-th root of unity, one means a number $\zeta$ whose period is
exactly n. For instance, $e^{2\pi i/n}$ is a primitive n-th root of unity. Show that
every other primitive n-th root of unity is equal to a power $e^{2\pi i r/n}$ where r is
an integer > 0 and relatively prime to n.

2. Let F be a field, and $K = F(\zeta)$, where $\zeta$ is a primitive n-th root of unity.
Show that K is Galois over F, and that its Galois group is commutative.
[Hint: For each embedding $\sigma$ over F, note that $\sigma\zeta = \zeta^{r(\sigma)}$ with some integer
$r(\sigma)$.] If $\tau$ is another embedding, what is $\tau\sigma\zeta$, and $\sigma\tau\zeta$?

3. (a) Let $K_1$, $K_2$ be two Galois extensions of a field F. Say $K_1 = F(\alpha_1)$ and
$K_2 = F(\alpha_2)$. Let $K = F(\alpha_1, \alpha_2)$. Show that K is Galois over F. Let G be
its Galois group. Map G into the direct product $G_{K_1/F} \times G_{K_2/F}$ by
associating with each $\sigma$ in G the pair $(\sigma_1, \sigma_2)$, where $\sigma_1$ is the restriction of
$\sigma$ to $K_1$ and $\sigma_2$ is the restriction of $\sigma$ to $K_2$. Show that this mapping is
an injective homomorphism. If $K_1 \cap K_2 = F$, show that the map is an
isomorphism.

(b) More generally, let $K_1, \ldots, K_r$ be finite extensions of a field F contained in
some field. Denote by $K_1 \cdots K_r$ the smallest field containing $K_1, \ldots, K_r$.
Thus if $K_i = F(\alpha_i)$ then $K_1 \cdots K_r = F(\alpha_1, \ldots, \alpha_r)$. Let $K = K_1 \cdots K_r$. We
call K the composite field. Suppose that $K_1, \ldots, K_r$ are finite Galois
extensions of F. Show that K is a Galois extension of F. Show that the
map

$$\sigma \mapsto (\mathrm{res}_{K_1}\sigma, \ldots, \mathrm{res}_{K_r}\sigma)$$

is an injective homomorphism of $G_{K/F}$ into $G_{K_1/F} \times \cdots \times G_{K_r/F}$. Finally,
assume that for each i,

$$(K_1 \cdots K_i) \cap K_{i+1} = F.$$

Show that the above map is an isomorphism of $G_{K/F}$ with the product.
[This follows from (a) by induction.]

4. (a) Let K be an abelian extension of F and let E be an arbitrary extension of
F. Prove that KE is abelian over E.

(b) Let $K_1$, $K_2$ be abelian extensions of F. Prove that $K_1 K_2$ is abelian over
F.

(c) Let K be a Galois extension of F. Prove that there exists a maximal
abelian subextension of K. In other words, there exists a subfield K' of K
containing F such that K' is abelian, and if E is a subfield of K abelian
over F then $E \subset K'$.

(d) Prove that $G_{K/K'}$ is the commutator group of $G_{K/F}$, in other words, $G_{K/K'}$ is
the group generated by all elements

$$\sigma\tau\sigma^{-1}\tau^{-1} \quad \text{with } \sigma, \tau \in G_{K/F}.$$

5. (a) Let K be a cyclic extension of F and let E be an arbitrary extension of F.
Prove that KE is cyclic over E.

(b) Let $K_1$, $K_2$ be cyclic extensions of F. Is $K_1 K_2$ cyclic over F? Proof?

6. Let K be Galois over F and let E be finite over F. Assume that $K \cap E = F$.
Prove that $[KE : K] = [E : F]$.

7. Let F be a field containing $i = \sqrt{-1}$. Let K be a splitting field of the
polynomial $t^4 - a$ with $a \in F$. Show that the Galois group of K over F is a
subgroup of a cyclic group of order 4. If $t^4 - a$ is irreducible over F, show
that this Galois group is cyclic of order 4. If $\alpha$ is a root of $t^4 - a$, express all
the other roots in terms of $\alpha$ and i.

8. More generally, let F be a field containing all n-th roots of unity. Let K be a
splitting field of the equation $t^n - a = 0$ with $a \in F$. Show that K is Galois
over F, with a Galois group which is a subgroup of a cyclic group of order n.
If $t^n - a$ is irreducible, prove that the Galois group is cyclic of order n.

9. Show that the Galois group of the polynomial $t^4 - 2$ over the rational
numbers has order 8, and contains a cyclic subgroup of order 4. [Hint: Prove
first that the polynomial is irreducible over Q. Then, if $\alpha$ is a real fourth root
of 2, consider $K = \mathbf{Q}(\alpha, i)$.]

10. Give an example of extension fields $F \subset E \subset K$ such that E/F is Galois, K/E
is Galois, but K/F is not Galois.

11. Let K/F be a Galois extension whose Galois group is the symmetric 
group on 3 elements. Prove that K does not contain a cyclic extension of 
F of degree 3. How many non-cyclic extensions of degree 3 does K
contain? 

12. (a) Let K be the splitting field of $t^4 - 2$ over the rationals. Prove that K
does not contain a cyclic extension of Q of degree 4.

(b) Let K be a Galois extension of F with group G of order 8, generated by
two elements $\sigma$, $\tau$ such that $\sigma^4 = 1$, $\tau^2 = 1$ and $\tau\sigma\tau = \sigma^3$. Prove that K
does not contain a subfield cyclic over F of degree 4.

13. Let K be Galois over F. Suppose the Galois group is generated by two
elements $\sigma$, $\tau$ such that $\sigma^m = 1$, $\tau^n = 1$, $\tau\sigma\tau^{-1} = \sigma^r$ where $r - 1 > 0$ is prime to
m. Assume that $[K : F] = mn$. Prove that the maximal subfield K' of K
which is abelian over F has degree n over F.

14. Let $S_n$ be the symmetric group of permutations of $\{1, \ldots, n\}$.

(a) Show that $S_n$ is generated by the transpositions (12), (13), ..., (1n). [Hint:
Use conjugations.]

(b) Show that $S_n$ is generated by the transpositions (12), (23), (34), ..., (n-1, n).

(c) Show that $S_n$ is generated by the cycles (12) and $(123 \cdots n)$.

(d) Let p be a prime number. Let $i \neq 1$ be an integer with $1 < i \leq p$. Show
that $S_p$ is generated by (1i) and $(123 \cdots p)$.

(e) Let $\sigma \in S_p$ be a permutation of period p. Show that $S_p$ is generated by $\sigma$
and any transposition.

15. Let f(t) be an irreducible polynomial of degree p over the rationals, where p 
is an odd prime. Suppose that f has $p - 2$ real roots and two complex roots
which are not real. Prove that the Galois group of f is $S_p$. [Use Exercise
14.] Exercise 8 of Chapter IV, §5 showed how to construct such polynomials. 


The following two exercises are due to Artin. 

16. Let F be a field of characteristic 0, contained in its algebraic closure A. Let 
$\alpha \in A$ and suppose $\alpha \notin F$, but every finite extension E of F with $E \neq F$
contains $\alpha$. In other words, F is a maximal subfield of A not containing $\alpha$.
Prove that every finite extension of F is cyclic. 

17. Again let $F_0$ be a field of characteristic 0, contained in its algebraic closure
A. Let $\sigma$ be an automorphism of A over $F_0$ and let F be the fixed field.
Prove that every finite extension of F is cyclic.


The next five exercises will show you how to determine the Galois groups of 
certain abelian extensions. 

18. Assume that the field F contains all the n-th roots of unity. Let B be a
subgroup of $F^*$ containing $F^{*n}$. [Recall that $F^*$ is the multiplicative group of
F, so $F^{*n}$ is the group of n-th powers of elements in $F^*$.] We assume that the
factor group $B/F^{*n}$ is finitely generated.

(a) Let $K = F(B^{1/n})$, i.e. K is the smallest field containing F and all n-th roots
of all elements in B. If $b_1, \ldots, b_r$ are distinct coset representatives of $B/F^{*n}$,
show that $K = F(b_1^{1/n}, \ldots, b_r^{1/n})$, so K is actually finite over F.

(b) Prove that K is Galois over F.

(c) Given $b \in B$ and $\sigma \in G_{K/F}$, let $\beta \in K$ be such that $\beta^n = b$, and define the
Kummer symbol

$$\langle b, \sigma \rangle = \sigma\beta/\beta.$$

Prove that $\langle b, \sigma \rangle$ is an n-th root of unity independent of the choice of $\beta$
such that $\beta^n = b$.

(d) Prove that the symbol $\langle b, \sigma \rangle$ is bimultiplicative, in other words:

$$\langle ab, \sigma \rangle = \langle a, \sigma \rangle \langle b, \sigma \rangle \quad \text{and} \quad \langle b, \sigma\tau \rangle = \langle b, \sigma \rangle \langle b, \tau \rangle.$$

(e) Let $b \in B$ be such that $\langle b, \sigma \rangle = 1$ for all $\sigma \in G_{K/F}$. Prove that $b \in F^{*n}$.

(f) Let $\sigma \in G_{K/F}$ be such that $\langle b, \sigma \rangle = 1$ for all $b \in B$. Prove that $\sigma = \mathrm{id}$.


Remark. As in Exercise 18, assume that the n-th roots of unity lie in F. Let
$a_1, \ldots, a_s \in F^*$. There is in most people a strong feeling that the Galois group of
$F(a_1^{1/n}, \ldots, a_s^{1/n})$ should be determined only by multiplicative relations between
$a_1, \ldots, a_s$. Exercise 18 is the main step in formulating this idea precisely. The next
exercises show you how to complete this idea. Note that Exercises 19, 20, 21 
concern only finite abelian groups, and not fields. The application to field theory 
comes in Exercise 22. 

19. (a) Let A be a cyclic group of order n. Let C be a cyclic group of order n.
Show that the group of homomorphisms of A into C is cyclic of order n.
We let Hom(A, C) denote the group of homomorphisms of A into C.

(b) Let A be a cyclic group of order d, and assume $d \mid n$. Let again C be a
cyclic group of order n. Show that Hom(A, C) is cyclic of order d.

20. Let $A = A_1 \times \cdots \times A_r$ be a product of cyclic groups of orders $d_1, \ldots, d_r$
respectively, and assume that $d_i \mid n$ for all i. Prove that

$$\mathrm{Hom}(A, C) \approx \prod_{i=1}^{r} \mathrm{Hom}(A_i, C),$$

and hence that Hom(A, C) is isomorphic to A, by using Exercise 19(b).

21. Let A, B be two finite abelian groups such that $x^n = 1$ for all $x \in A$ and all
$x \in B$. Suppose given a bimultiplicative mapping

$$A \times B \to C \quad \text{denoted by} \quad (a, b) \mapsto \langle a, b \rangle$$

into a cyclic group C of order n. Bimultiplicative means that for all $a, a' \in A$
and $b, b' \in B$ we have

$$\langle aa', b \rangle = \langle a, b \rangle \langle a', b \rangle \quad \text{and} \quad \langle a, bb' \rangle = \langle a, b \rangle \langle a, b' \rangle.$$

Define a perpendicular to b to mean that $\langle a, b \rangle = 1$. Assume that if $a \in A$ is
perpendicular to all elements of B then $a = 1$, and also if $b \in B$ is perpendicular
to all elements of A, then $b = 1$. Prove that there is a natural
isomorphism

$$A \approx \mathrm{Hom}(B, C),$$

and hence by Exercise 20, that there is some isomorphism $A \approx B$.

22. In Exercise 18, prove that $G_{K/F} \approx B/F^{*n}$.



VII, §5. QUADRATIC AND CUBIC EXTENSIONS 

In this section we continue to assume that 
all fields have characteristic 0. 


Quadratic polynomials 

Let F be a field. Any irreducible polynomial $t^2 + bt + c$ over F has a
splitting field $F(\alpha)$, with

$$\alpha = \frac{-b \pm \sqrt{b^2 - 4c}}{2}.$$

Thus $F(\alpha)$ is Galois over F, and its Galois group is cyclic of order 2.
If we let $d = b^2 - 4c$, then $F(\alpha) = F(\sqrt{d})$. Conversely, the polynomial
$t^2 - d$ is irreducible over F if and only if d is not a square in F.


Cubic polynomials 

Consider now the cubic case. Let

$$f(t) = (t - \alpha_1)(t - \alpha_2)(t - \alpha_3) \in F[t]$$

be a cubic polynomial with coefficients in F. The roots may not be in F,
however. We let the discriminant of f be defined as

$$D = [(\alpha_2 - \alpha_1)(\alpha_3 - \alpha_1)(\alpha_3 - \alpha_2)]^2.$$

Any automorphism of $F(\alpha_1, \alpha_2, \alpha_3)$ leaves D fixed because it changes the
product

$$\Delta = (\alpha_2 - \alpha_1)(\alpha_3 - \alpha_1)(\alpha_3 - \alpha_2)$$

at most by a sign. This is a special case of the proof of Chapter II, §6.
Let K be the splitting field, so

$$K = F(\alpha_1, \alpha_2, \alpha_3).$$

Given $\sigma \in G_{K/F}$ we let

$$\sigma\Delta = \varepsilon(\sigma)\Delta \quad \text{with } \varepsilon(\sigma) = 1 \text{ or } -1.$$


Then the map

$$\sigma \mapsto \varepsilon(\sigma)$$

is immediately verified to be a homomorphism of $G_{K/F}$ into the cyclic
group with two elements. The kernel of this homomorphism consists of
those $\sigma$ which induce an even permutation of the roots. In any case we
have an injective homomorphism

$$G_{K/F} \to S_3$$

which to each $\sigma$ associates the corresponding permutation of the roots.
Since $S_3$ has order 6, it follows that $G_{K/F}$ has order dividing 6, so $G_{K/F}$
has order 1, 2, 3, or 6. We let $A_3$ be the alternating group, i.e. the
subgroup of even permutations in $S_3$. One possible question is whether
$G_{K/F}$ is isomorphic to $A_3$ or to $S_3$.

Note that $D = \Delta^2$ and $D \in F$. This follows from the general theory of
the discriminant, which is a symmetric function of the roots, and
therefore a polynomial in the coefficients of f with integer coefficients by
Chapter IV, Theorem 8.1. Since the coefficients of f are in F, the discriminant
is in F. We can also see this result from Galois theory, since D is fixed
under the Galois group $G_{K/F}$.

Theorem 5.1. Let f be an irreducible polynomial over F, of degree 3. 
Let K be its splitting field and $G = G_{K/F}$. Then G is isomorphic to $S_3$ if
and only if the discriminant D of f is not a square in F. If D is a
square in F, then K has degree 3 over F and $G_{K/F}$ is cyclic of order 3.

Proof. Suppose D is a square in F. Since the square root of D is
uniquely determined up to a sign, it follows that $\Delta \in F$. Hence $\Delta$ is fixed
under the Galois group. This implies that $G_{K/F}$ is isomorphic to a
subgroup of the alternating group, which has order 3. Since $[K : F] \geq 3$
because f is assumed irreducible, we conclude that $G_{K/F}$ is isomorphic to
$A_3$, and therefore $G_{K/F}$ is cyclic of order 3.

Suppose D is not a square in F. Then

$$[F(\Delta) : F] = 2.$$

Since by irreducibility $[F(\alpha) : F] = 3$, it follows that $[F(\alpha, \Delta) : F] = 6$,
whence $G_{K/F}$ has order at least 6. But $G_{K/F}$ is isomorphic to a subgroup
of $S_3$, so $G_{K/F}$ has order exactly 6 and is isomorphic to $S_3$. This
concludes the proof.

Actually the proofs of the two cases in Theorem 5.1 have also given us
a proof of the following result.

Theorem 5.2. Let $f(t) = t^3 + \cdots$ be a polynomial of degree 3, irreducible
over the field F. Let D be the discriminant and let $\alpha$ be a root. Then
the splitting field of f is $K = F(\sqrt{D}, \alpha)$.

After replacing the variable t by $t - c$ with a suitable constant $c \in F$ if
necessary, a cubic polynomial with leading coefficient 1 in $F[t]$ can be
brought into the form

$$f(t) = t^3 + at + b = (t - \alpha_1)(t - \alpha_2)(t - \alpha_3)$$

with $a, b \in F$. The roots may or may not be in F. If f has no root in F,
then f is irreducible. We find

$$\alpha_1 + \alpha_2 + \alpha_3 = 0, \qquad \alpha_1\alpha_2 + \alpha_1\alpha_3 + \alpha_2\alpha_3 = a, \qquad -\alpha_1\alpha_2\alpha_3 = b.$$

As in Chapter IV, §8 the discriminant has the form

$$D = -4a^3 - 27b^2.$$


Theorems 5.1, 5.2 and this formula now give you the tools to determine 
the Galois group of a cubic polynomial, provided you have a means of 
determining its irreducibility. 

We emphasize that before doing anything else, one must always determine
the irreducibility of f.


Example. We consider the polynomial $f(t) = t^3 - 3t + 1$. It has no
integral root, and hence is irreducible over Q. Its discriminant is

$$D = -4a^3 - 27b^2 = 3^4.$$

The discriminant is a square in Q, and hence the Galois group of f over
Q is cyclic of order 3. The splitting field is $\mathbf{Q}(\alpha)$ for any root $\alpha$.
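
The procedure of this example can be sketched in a few lines of code for a
cubic $t^3 + at + b$ with integer coefficients a, b over Q. This is only an
illustration under stated assumptions: irreducibility is tested by looking
for an integral root among the divisors of b (a rational root of a monic
integer polynomial is an integer dividing the constant term), and the
function names `has_rational_root`, `is_square` and `cubic_galois_group` are
hypothetical.

```python
from math import isqrt


def has_rational_root(a, b):
    """Test whether t^3 + a*t + b (a, b integers) has a root in Q."""
    if b == 0:
        return True  # t = 0 is a root
    for d in range(1, abs(b) + 1):
        if b % d == 0:
            for r in (d, -d):
                if r**3 + a * r + b == 0:
                    return True
    return False


def is_square(n):
    """Test whether the integer n is a square in Q (equivalently in Z)."""
    return n >= 0 and isqrt(n) ** 2 == n


def cubic_galois_group(a, b):
    """Classify t^3 + a*t + b over Q using Theorem 5.1."""
    if has_rational_root(a, b):
        return "reducible"  # Theorem 5.1 does not apply
    D = -4 * a**3 - 27 * b**2
    return "A3" if is_square(D) else "S3"


print(cubic_galois_group(-3, 1))   # the example in the text, D = 81
print(cubic_galois_group(-1, -1))  # t^3 - t - 1, D = -23
```

Note that the irreducibility test comes first, in line with the warning
above: the discriminant alone says nothing about a reducible cubic.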


VII, §5. EXERCISES 

1. Determine the Galois groups of the following polynomials over the rational 
numbers. 

(a) $t^2 - t + 1$ (b) $t^2 - 4$ (c) $t^2 + t + 1$ (d) $t^2 - 27$

2. Let $f(t)$ be a polynomial of degree 3, irreducible over the field F. Prove that
the splitting field K of f contains at most one subfield of degree 2 over F,
namely $F(\sqrt{D})$, where D is the discriminant. If D is a square in F, then K
does not contain any subfield of degree 2 over F.

3. Determine the Galois groups of the following polynomials over the rational
numbers. Find the discriminants.

(a) $t^3 - 3t + 1$ (b) $t^3 + 3$ (c) $t^2 - 5$

(d) $t^3 - a$ where a is rational, $\neq 0$, and is not a cube of a rational number

(e) $t^3 - 5t + 7$ (f) $t^3 + 2t + 2$ (g) $t^3 - t - 1$


4. Determine the Galois groups of the following polynomials over the indicated 
field. 


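(a) $t^3 - 10$ over $\mathbf{Q}(\sqrt{2})$
(b) $t^3 - 10$ over Q
(c) $t^3 - t - 1$ over $\mathbf{Q}(\sqrt{-23})$
(d) $t^3 - 10$ over $\mathbf{Q}(\sqrt{-3})$
(e) $t^3 - 2$ over $\mathbf{Q}(\sqrt{-3})$
(f) $t^3 - 9$ over $\mathbf{Q}(\sqrt{-3})$
(g) $t^2 - 5$ over $\mathbf{Q}(\sqrt{-5})$
(h) $t^2 + 5$ over $\mathbf{Q}(\sqrt{-5})$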


5. Let F be a field and let 


$$f(t) = t^3 + a_2 t^2 + a_1 t + a_0 \quad \text{and} \quad g(t) = t^2 - c$$

be irreducible polynomials over F. Let D be the discriminant of f. Assume
that

$$[F(D^{1/2}) : F] = 2 \quad \text{and} \quad F(D^{1/2}) \neq F(c^{1/2}).$$

Let $\alpha$ be a root of f and $\beta$ a root of g in an algebraic closure. Prove:

(a) The splitting field of $f(t)g(t)$ over F has degree 12.

(b) Let $\gamma = \alpha + \beta$. Then $[F(\gamma) : F] = 6$.

6. Let f, g be irreducible polynomials of degree 3 over the field F. Let $D_f$, $D_g$ be
their discriminants. Assume that $D_f$ is not a square in F but $D_g$ is a square
in F.

(a) Prove that the splitting field of fg over F has degree 18.

(b) Let $S_3$ be the symmetric group on 3 elements, and let $C_3$ be a cyclic
group of order 3. Prove that the Galois group of fg over F is isomorphic
to $S_3 \times C_3$.

7. Let $f(t) = t^3 + at + b$. Let $\alpha$ be a root, and let $\beta$ be a number such that

$$\alpha = \beta - \frac{a}{3\beta}.$$

Show that such a $\beta$ can be found if $a \neq 0$. Show that

$$\beta^3 = -b/2 + \sqrt{-D/108}.$$

In this way we get an expression of $\alpha$ in terms of radicals.

For the next exercises, recall the following result, or prove it if you have not 
already done so. 

Let $a, b \in F$ and suppose $F(\sqrt{a})$ and $F(\sqrt{b})$ have degree 2 over F. Then
$F(\sqrt{a}) = F(\sqrt{b})$ if and only if there exists $c \in F$ such that $a = c^2 b$.

8. Let $f(t) = t^4 + 30t^2 + 45$. Let $\alpha$ be a root of f. Prove that $\mathbf{Q}(\alpha)$ is cyclic of
degree 4 over Q. [Hint: Note that to solve the equation, you can apply the
quadratic formula twice.]

9. Let $f(t) = t^4 + 4t^2 + 2$.

(a) Prove that f(t) is irreducible over Q. 

(b) Prove that the Galois group over Q is cyclic. 

10. Let $K = \mathbf{Q}(\sqrt{2}, \sqrt{3}, \alpha)$ where $\alpha^2 = (9 - 5\sqrt{3})(2 - \sqrt{2})$.

(a) Prove that K is a Galois extension of Q.

(b) Prove that $G_{K/\mathbf{Q}}$ is not cyclic but contains a unique element of order 2.

11. Let $[E : F] = 4$. Prove that E contains a subfield L with $[L : F] = 2$ if and
only if $E = F(\alpha)$ where $\alpha$ is a root of an irreducible polynomial in $F[t]$ of the
form $t^4 + bt^2 + c$.


VII, §6. SOLVABILITY BY RADICALS 

We continue to assume that all fields are of characteristic 0. 

A Galois extension whose Galois group is abelian is said to be an abelian
extension. Let K be a Galois extension of F, $K = F(\alpha)$. Let $\sigma$, $\tau$ be
automorphisms of K over F. To verify that $\sigma\tau = \tau\sigma$ it suffices to verify
that $\sigma\tau\alpha = \tau\sigma\alpha$. Indeed, any element of K can be written in the form

$$x = a_0 + a_1\alpha + \cdots + a_{d-1}\alpha^{d-1}$$

if d is the degree of $\alpha$ over F. Since $\sigma\tau a_i = \tau\sigma a_i$ for all i, it follows that
if in addition $\sigma\tau\alpha = \tau\sigma\alpha$, then $\sigma\tau\alpha^i = \tau\sigma\alpha^i$ for all i, whence $\tau\sigma x = \sigma\tau x$.
We shall describe two important cases. 


Theorem 6.1. Let F be a field and n a positive integer. Let $\zeta$ be a
primitive n-th root of unity, that is $\zeta^n = 1$, and every n-th root of unity
can be written in the form $\zeta^r$ for some r, $0 \leq r < n$. Let $K = F(\zeta)$.
Then K is Galois and abelian over F.

Proof. Let $\sigma$ be an embedding of K over F. Then

$$(\sigma\zeta)^n = \sigma(\zeta^n) = 1.$$

Hence $\sigma\zeta$ is also an n-th root of unity, and there exists an integer r such
that $\sigma\zeta = \zeta^r$. In particular, K is Galois over F. Furthermore, if $\tau$ is
another automorphism of K over F, then $\tau\zeta = \zeta^s$ for some s, and

$$\sigma\tau\zeta = \sigma(\zeta^s) = \sigma(\zeta)^s = \zeta^{rs} = \tau\sigma\zeta.$$

Hence $\sigma\tau = \tau\sigma$, and the Galois group is abelian, as was to be shown.
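
The computation $\sigma\tau\zeta = \zeta^{rs} = \tau\sigma\zeta$ can be checked mechanically by tracking
exponents modulo n. In the sketch below, $\zeta^k$ is represented by the
integer k mod n, and an automorphism by an exponent r prime to n. These
representations and the names `unit_exponents`, `apply` and `maps_commute`
are assumptions made for illustration; over a general field F only some of
these exponents need actually occur as automorphisms.

```python
from math import gcd


def unit_exponents(n):
    """Exponents r with 0 < r < n that are relatively prime to n (n >= 2)."""
    return [r for r in range(1, n) if gcd(r, n) == 1]


def apply(r, k, n):
    """Image of zeta^k under the map zeta -> zeta^r, as an exponent mod n."""
    return (r * k) % n


def maps_commute(n):
    """Check that any two maps zeta -> zeta^r, zeta -> zeta^s commute."""
    units = unit_exponents(n)
    return all(
        apply(r, apply(s, k, n), n) == apply(s, apply(r, k, n), n)
        for r in units
        for s in units
        for k in range(n)
    )


print(unit_exponents(12), maps_commute(12))
```

The check succeeds for the simple reason used in the proof: composing the
maps multiplies the exponents, and multiplication of integers is commutative.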

Theorem 6.2. Let F be a field and assume that the n-th roots of unity 
lie in F. Let $a \in F$. Let $\alpha$ be a root of the polynomial $t^n - a$, so $\alpha^n = a$,
and let $K = F(\alpha)$. Then K is abelian over F, and in fact K is cyclic
over F, that is $G_{K/F}$ is cyclic.

Proof. Let $\sigma$ be an embedding of K over F. Then

$$(\sigma\alpha)^n = \sigma(\alpha^n) = \sigma a = a.$$

Hence $\sigma\alpha$ is also a root of $t^n - a$, and

$$(\sigma\alpha/\alpha)^n = 1.$$

Hence there exists an n-th root of unity $\zeta_\sigma$ (not necessarily primitive)
such that

$$\sigma\alpha = \zeta_\sigma\alpha.$$

(Note that we index $\zeta$ by $\sigma$ to denote its dependence on $\sigma$.) In particular,
K is Galois over F. Let $\tau$ be any automorphism of K over F. Then
similarly there is a root of unity $\zeta_\tau$ such that

$$\tau\alpha = \zeta_\tau\alpha.$$

In addition,

$$\sigma(\tau(\alpha)) = \sigma(\zeta_\tau\alpha) = \zeta_\tau\sigma(\alpha) = \zeta_\tau\zeta_\sigma\alpha,$$

because the roots of unity are in F, and so are left fixed by $\sigma$. Therefore
the association

$$\sigma \mapsto \zeta_\sigma$$

is a homomorphism of the Galois group $G_{K/F}$ into the cyclic group of
n-th roots of unity. If $\zeta_\sigma = 1$ then $\sigma$ is the identity on $\alpha$, and therefore
the identity on K. Hence the kernel of this homomorphism is 1. Therefore
$G_{K/F}$ is isomorphic to a subgroup of the cyclic group of n-th roots of
unity, and is therefore cyclic. This concludes the proof.

Remark. It may of course happen that in Theorem 6.2 the Galois
group $G_{K/F}$ is cyclic of order less than n. For instance, a could be a d-th
power in F with $d \mid n$. However, if $t^n - a$ is irreducible, then $G_{K/F}$ has
order n. Indeed, we know that $[K : F] \geq n$ by irreducibility, so $G_{K/F}$ has
order at least n, whence $G_{K/F}$ has order exactly n since it is a subgroup of
a cyclic group of order n.

For a concrete example, let $F = \mathbf{Q}(\zeta)$ where $\zeta$ is a primitive cube root
of unity, so $F = \mathbf{Q}(\sqrt{-3})$. Let $\alpha$ be a root of $t^9 - 27$. Then

$$[F(\alpha) : F] = 3.$$

Let F be a field and f a polynomial of degree $n \geq 1$ over F. We shall
say that f is solvable by radicals if its splitting field is contained in a
finite extension K which admits a sequence of subfields

$$F = F_0 \subset F_1 \subset F_2 \subset \cdots \subset F_m = K$$

such that:

(a) K is Galois over F.

(b) $F_1 = F(\zeta)$ for some primitive n-th root of unity $\zeta$.

(c) For each i with $1 \leq i \leq m - 1$, the field $F_{i+1}$ can be written in the
form $F_{i+1} = F_i(\alpha_{i+1})$, where $\alpha_{i+1}$ is a root of some polynomial

$$t^{d_i} - a_i = 0,$$

where $d_i$ divides n, and $a_i$ is an element of $F_i$.

Observe that if d divides n, then $\zeta^{n/d}$ is a primitive d-th root of unity
(proof?) and hence by Theorems 6.1 and 6.2 the extension $F_{i+1}$ of $F_i$ is
abelian. We also have seen that $F_1$ is abelian over F. Thus K is
decomposed into a tower of abelian extensions. Let $G_i$ be the Galois
group of K over $F_i$. Then we obtain a corresponding sequence of groups

$$G \supset G_1 \supset G_2 \supset \cdots \supset G_m = \{e\}$$

such that $G_{i+1}$ is normal in $G_i$, and the factor group $G_i/G_{i+1}$ is abelian
by Theorem 6.2.

Theorem 6.3. If f is solvable by radicals , then its Galois group is 

solvable. 

Proof. Let L be the splitting field of f, and suppose that L is
contained in a Galois extension K as above. By the definition of a
solvable group, it follows that $G_{K/F}$ is solvable. But $G_{L/F}$ is a factor group
of $G_{K/F}$, and so $G_{L/F}$ is solvable. This proves the theorem.

Remark. In the definition of solvability by radicals, we built in from 
the start the hypothesis that K is Galois over F. In the exercises, you 
can develop a proof of a slightly stronger result, without making this
assumption. The reason for making the assumption was to exhibit clearly
and briefly the essential part of the argument, which from solvability by 
radicals implies the solvability of the Galois group. The steps given in 
the exercises will be of a more routine type. 

The converse is also true: if the Galois group of an extension is 
solvable, then the extension is solvable by radicals. This takes somewhat 
more arguments to prove, and you can look it up in a more advanced 
text, e.g. my Algebra. 

It was a famous problem once to determine whether every polynomial 
is solvable by radicals. To show that this is not the case, it will suffice 
to exhibit a polynomial whose Galois group is the symmetric group $S_5$
(or $S_n$ for $n \geq 5$), according to Theorem 6.3 of Chapter II, §6. This is
easily done:

Theorem 6.4. Let $x_1, \ldots, x_n$ be algebraically independent over a field $F_0$,
and let

$$f(t) = \prod_{i=1}^{n} (t - x_i) = t^n - s_1 t^{n-1} + \cdots + (-1)^n s_n,$$

where

$$s_1 = x_1 + \cdots + x_n, \quad \ldots, \quad s_n = x_1 \cdots x_n$$

are the coefficients of f. Let $F = F_0(s_1, \ldots, s_n)$. Let $K = F(x_1, \ldots, x_n)$.
Then K is Galois over F, with Galois group $S_n$.

Proof. Certainly K is a Galois extension of F because

$$K = F(x_1, \ldots, x_n)$$

is the splitting field of f. Let $G = G_{K/F}$. By general Galois theory, we
know that there is an injective homomorphism of G into $S_n$:

$$G \to S_n$$

which to each automorphism $\sigma$ of K over F associates a permutation of
the roots. So we have to prove that given a permutation $\pi$ of the roots
$x_1, \ldots, x_n$ there exists an automorphism of K over F which restricts to this
permutation. But $(\pi(x_1), \ldots, \pi(x_n))$ being a permutation of the roots, the
elements $\pi(x_1), \ldots, \pi(x_n)$ are algebraically independent, and by the basic
fact of polynomials in several variables, there is an isomorphism

$$F_0[x_1, \ldots, x_n] \to F_0[\pi(x_1), \ldots, \pi(x_n)]$$

which extends to an isomorphism of the quotient fields

$$F_0(x_1, \ldots, x_n) \to F_0(\pi(x_1), \ldots, \pi(x_n)),$$

sending $x_i$ on $\pi(x_i)$ for $i = 1, \ldots, n$. This isomorphism is therefore an
automorphism of $F_0(x_1, \ldots, x_n)$, which leaves the ring of symmetric
polynomials $F_0[s_1, \ldots, s_n]$ fixed, and therefore leaves the quotient field
$F_0(s_1, \ldots, s_n)$ fixed. This proves that the map $G \to S_n$ is surjective on $S_n$,
and concludes the proof of the theorem.
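
The two facts used in the proof, that the $s_k$ are the coefficients of f up
to sign and that they are unchanged by any permutation of the $x_i$, can be
checked on numbers. The sketch below substitutes the integers 2, 3, 5, 7
for $x_1, \ldots, x_4$, which is an assumption for illustration only (integers are
of course not algebraically independent); the helper names are hypothetical.

```python
from itertools import combinations, permutations
from math import prod


def elementary_symmetric(xs):
    """Return [s_1, ..., s_n], the elementary symmetric functions of xs."""
    n = len(xs)
    return [sum(prod(c) for c in combinations(xs, k)) for k in range(1, n + 1)]


def expand_product(xs):
    """Coefficients of the product of (t - x) for x in xs, highest degree first."""
    coeffs = [1]
    for x in xs:
        # Multiply the current polynomial by (t - x).
        coeffs = [a - x * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs


xs = (2, 3, 5, 7)
s = elementary_symmetric(xs)
signed = [1] + [(-1) ** k * s_k for k, s_k in enumerate(s, start=1)]
invariant = all(elementary_symmetric(p) == s for p in permutations(xs))
print(s, expand_product(xs) == signed, invariant)
```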

In the next section, we shall show that one can always select n com- 
plex numbers algebraically independent over Q. 

In some sense, in exhibiting a Galois extension whose Galois group is 
S n we have “cheated”. We really would like to see a Galois extension of 
the rational numbers Q whose Galois group is S n . This is somewhat 
more difficult to achieve. It can be shown by techniques beyond this 
book that for a lot of special values $s_1, \ldots, s_n$ in the rational numbers, the
polynomial

$$f(t) = t^n - s_1 t^{n-1} + \cdots + (-1)^n s_n$$

has the symmetric group as its Galois group.

Also the polynomial $t^5 - t - 1$ has the symmetric group as its Galois
group. You can refer to a more advanced algebra book to see how to 
prove such statements. 

Remark. Radicals are merely the simplest way of expressing irrational 
numbers. One can ask much more general questions, for instance along 
the following lines. Let $\alpha$ be an algebraic number, that is the root of a
polynomial of degree $\geq 1$ with rational coefficients. Let us start with the
question: Is there a root of unity $\zeta$ such that $\alpha \in \mathbf{Q}(\zeta)$? We suppose $\mathbf{Q}(\zeta)$
is embedded in the complex numbers. We can write

$$\zeta = e^{2\pi i r/n}$$

where r, n are positive integers. If we define the function

$$f(z) = e^{2\pi i z},$$

then our question amounts to whether $\alpha \in \mathbf{Q}(f(a))$ for some rational
number a. Now the question can be generalized, by taking for f an
arbitrary classical function, not just the exponential function. In analysis 
courses, you may have heard of the Bessel function, the gamma function, 
the zeta function, various other functions, such as solutions of differential
equations, whatever. The question then runs almost immediately into
unsolved problems, and leads into a mixture of algebra, number theory, 
and analysis. 


VII, §6. EXERCISES 

1. Let $K_1$, $K_2$ be Galois extensions of F whose Galois groups are solvable. Prove
that the Galois groups of $K_1 K_2$ and $K_1 \cap K_2$ are solvable.

2. Let K be a Galois extension of F whose Galois group is solvable. Let E be 
any extension of F. Prove that KE/E has solvable Galois group. 

3. By a radical tower over a field F we mean a sequence of finite extensions

$$F = F_0 \subset F_1 \subset \cdots \subset F_r,$$

having the property that there exist positive integers $d_i$, elements $a_i \in F_i$ and $\alpha_i$
with $\alpha_i^{d_i} = a_i$ such that

$$F_{i+1} = F_i(\alpha_i).$$

We say that E is contained in a radical tower if there exists a radical tower as
above such that $E \subset F_r$.

Let E be a finite extension of F. Prove:

(a) If E is contained in a radical tower and E' is a conjugate of E over F, then
E' is contained in a radical tower.

(b) Suppose E is contained in a radical tower. Let L be an extension of
F. Then EL is contained in a radical tower of L.

(c) If $E_1$, $E_2$ are finite extensions of F, each one contained in a radical tower,
then the composite $E_1 E_2$ is contained in a radical tower.

(d) If E is contained in a radical tower, then the smallest normal extension of
F containing E is contained in a radical tower.

(e) If $F_0 \subset \cdots \subset F_r$ is a radical tower, let K be the smallest normal extension
of $F_0$ containing $F_r$. Then K has a radical tower over F.

(f) Let E be a finite extension of F and suppose E is contained in a radical
tower. Show that there exists a radical tower

$$F \subset E_0 \subset E_1 \subset \cdots \subset E_m$$

such that:

$E_m$ is Galois over F and $E \subset E_m$;

$E_0 = F(\zeta)$ where $\zeta$ is a primitive n-th root of unity;

For each i, $E_{i+1} = E_i(\alpha_i)$ where $\alpha_i^{d_i} = a_i \in E_i$ and $d_i \mid n$.

Thus if E is contained in a radical tower, then E is solvable by radicals in 
the sense given in the text. The property of being contained in a radical 
tower is closer to the naive notion of an algebraic element being expressible 
in terms of radicals, and so we gave the development of this exercise to 
show that this naive notion is equivalent to the notion given in the text.


VII, §7. INFINITE EXTENSIONS

We begin with some cardinality statements concerning fields. We use 
only denumerable or finite sets in the present situation, and all that we 
need about such sets is the definition of Chapter X, §3 and the following:

If D is denumerable, then a finite product $D \times \cdots \times D$ is denumerable.

A denumerable union of denumerable sets is denumerable.

An infinite subset of a denumerable set is denumerable. 

If D is denumerable, and D -> S is a surjective map onto some set S 
which is not finite, then S is denumerable. 

The reader will find simple self-contained proofs in Chapter X (cf. 
Theorem 3.2 and its corollaries), and for denumerable sets, these state- 
ments are nothing but simple exercises. 

Let F be a field, and E an extension of F. We shall say that E is 
algebraic over F if every element of E is algebraic over F. Let A be an 
algebraically closed field containing F. Let $F^a$ be the subset of A consisting
of all elements which are algebraic over F. The superscript “a”
denotes “algebraic closure” of F in A. Then $F^a$ is a field, because we
have seen that whenever $\alpha$, $\beta$ are algebraic, then $\alpha + \beta$ and $\alpha\beta$ are
algebraic, being contained in the finite extension $F(\alpha, \beta)$ of F.

Theorem 7.1. Let F be a denumerable field. Then $F^a$ is denumerable.

Proof. We proceed stepwise. Let P n be the set of irreducible poly- 
nomials of degree n ^ 1 with coefficients in F and leading coefficient 1. 
To each polynomial feP n , 

/( 0 = t n + a n-\t n 1 T *** + a o> 

we associate its coefficients (a n _ 1 , . . . ,a 0 ). We thus obtain an injection of 
P n into F x • • • x F = F n , whence we conclude that P n is denumerable. 

Next, for each feP n , we let cxy l5 ...,a f n be its roots, in a fixed order. 
Let J n = {!,... ,w}, and let 

P n x {1, . . . ,n} -> A 

be the map of P n x J n into A such that 


(/, 0 i-> a fti 

for i= l,..., n and feP n . Then this map is a surjection of P n x J n onto 
the set of numbers of degree n over F, and hence this set is denumerable. 
Taking the union over all n= 1, 2,... we conclude that the set of all 
numbers algebraic over F is denumerable. This proves our theorem. 
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Theorem 7.2. Let F be a denumerable field. Then the field of rational 
functions F(t) is denumerable. 

Proof. It will suffice to prove that the ring of polynomials $F[t]$ is
denumerable, because we have a surjective map

\[ F[t] \times F[t]_0 \to F(t), \]

where $F[t]_0$ denotes the set of non-zero elements of $F[t]$. The map is of
course $(a, b) \mapsto a/b$. For each n, let $P_n$ be the set of polynomials
of degree $\leq n$ with coefficients in F. Then $P_n$ is denumerable, and
hence $F[t]$ is denumerable, being the denumerable union of
$P_0, P_1, P_2, \ldots$ together with the single element 0.

Note. The fact that R (and hence C) is not denumerable will be 
proved in Chapter IX, Corollary 4.4. 

Corollary 7.3. Given an integer $n \geq 1$, there exist n algebraically
independent complex numbers over $\mathbf{Q}$.

Proof. The field $\mathbf{Q}^a$ is denumerable, and $\mathbf{C}$ is not. Hence
there exists $x_1 \in \mathbf{C}$ which is transcendental over $\mathbf{Q}^a$.
Let $F_1 = \mathbf{Q}^a(x_1)$. Then $F_1$ is denumerable. Proceeding
inductively, we let $x_2$ be transcendental over $F_1$, and so on, to find
our desired elements $x_1, x_2, \ldots, x_n$.

The complex numbers form a convenient algebraically closed field of 
characteristic 0 for many applications. 

Theorem 7.4. Let F be a field. Then there exists an algebraic closure of 
F, that is, there exists a field A algebraic over F such that A is 
algebraically closed. 

Proof. In general, we face a problem which is set-theoretic in nature. 
In Exercise 9 of Chapter X, §3 you will see how to carry out the proof in 
general. We give the proof here in the most important special case, when 
F is denumerable. All the basic features of the proof already occur in this 
case. 

As in Theorem 7.1 we give an enumeration of all polynomials of
degree $\geq 1$ over F, say $\{f_1, f_2, f_3, \ldots\}$. By induction we can
find a splitting field $K_1$ of $f_1$ over F, then a splitting field $K_2$ of
$f_1 f_2$ over F containing $K_1$ as subfield; then a splitting field $K_3$ of
$f_1 f_2 f_3$ containing $K_2$; and so on. In general, we let $K_{n+1}$ be a
splitting field of $f_1 f_2 \cdots f_{n+1}$ containing $K_n$. We take the union

\[ A = \bigcup_{n=1}^{\infty} K_n. \]
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Observe first that A is a field. Two elements of A lie in some K n , so 
their sum, product and quotient (by a non-zero element) are defined in 
K n , and since K n is a subfield of K m for m > n, this sum, product and 
quotient do not depend on the choice of n. Furthermore, A is algebraic 
over F because every element of A lies in some K n , which is finite over F. 
We claim that A is algebraically closed. Let $\alpha$ be algebraic over A.
Then $\alpha$ is the root of a polynomial $f(t) \in A[t]$ and f actually has
coefficients in some field $K_n$, so $\alpha$ is algebraic over $K_n$. Then
$K_n(\alpha)$ is algebraic, so finite over F, so $\alpha$ lies in $K_m$ for
some m, whence $\alpha \in A$. This concludes the proof.

Next, we deal with the uniqueness of an algebraic closure. 

Theorem 7.5. Let A, B be algebraic closures of F. Then there exists an
isomorphism $\sigma: A \to B$ over F (that is, $\sigma$ is the identity on F).

Proof. Again in general we meet a set-theoretic difficulty which
disappears in the denumerable case, so we give the proof when F is
denumerable. By an argument similar to the one used in the previous
theorem, we write A as a union

\[ A = \bigcup_{n=1}^{\infty} K_n, \]

where $K_n$ is finite normal over F and $K_n \subset K_{n+1}$ for all n. By
the embedding theorem for finite extensions, there exists an embedding

\[ \sigma_1: K_1 \to B \]

which is the identity on F. By induction, suppose we have obtained an
embedding

\[ \sigma_n: K_n \to B \]

over F. By the embedding theorem, there exists an embedding

\[ \sigma_{n+1}: K_{n+1} \to B \]

which is an extension of $\sigma_n$. Then we can define $\sigma$ on A as
follows. Given an element $x \in A$, there is some n such that $x \in K_n$.
The element $\sigma_n x$ of B does not depend on the choice of n, because of
the compatibility condition that if $m > n$ then the restriction of
$\sigma_m$ to $K_n$ is $\sigma_n$. We define $\sigma x$ to be $\sigma_n x$.
Then $\sigma$ is an embedding of A into B which restricts to $\sigma_n$ on
$K_n$. It will now suffice to prove:

Let A, B be algebraic closures of F and let $\sigma: A \to B$ be an embedding
over F. Then $B = \sigma A$, so $\sigma$ is an isomorphism.

Proof. Since A is algebraically closed, it follows that $\sigma A$ is
algebraically closed, and $\sigma A \subset B$. Since B is algebraic over F,
it follows that B is algebraic over $\sigma A$. Let $y \in B$. Then y is
algebraic over $\sigma A$, and therefore lies in $\sigma A$, so
$B = \sigma A$, as was to be shown. This concludes the proof of
Theorem 7.5.


Remark. The element $\sigma$ in the above proof was defined in a way
which could be called a limit of a sequence of embeddings $\sigma_n$. The
study of such limits would constitute a further chapter in field theory.
We shall
not go into this matter except by giving some examples as exercises. The 
study of such limits is also analogous to the considerations of Chapter 
IX, which deals with completions. 

Just as in the finite case, we can speak of the group of automorphisms 
Aut(A/F), and this group is also called a Galois group for the possibly 
infinite extension A of F. 

More generally, let

\[ K = \bigcup_{n=1}^{\infty} K_n \]

be a union of finite Galois extensions $K_n$ of F, such that
$K_n \subset K_{n+1}$ for all n. Then we let

\[ G_{K/F} = \mathrm{Aut}(K/F) \]

be the group of automorphisms of K over F. Each such automorphism
restricts to an embedding of $K_n$, which must be an automorphism of $K_n$.
By the extension theorem and an inductive definition as in the proof of
the uniqueness of the algebraic closure, we conclude that the restriction
homomorphism

\[ \mathrm{res}: G_{K/F} \to G_{K_n/F} \]

is surjective. Its kernel is $G_{K/K_n}$. A major problem is to determine
$G_{K/F}$ for various infinite extensions, and especially when
$K = \mathbf{Q}^a$ is the algebraic closure of the rational numbers. In an
exercise, you can see an example for a tower of abelian extensions. The
point is that automorphisms in $G_{K/F}$ are limits of sequences
$\{\sigma_n\}$ of automorphisms of the finite Galois extensions $K_n/F$.
Thus the consideration of infinite extensions leads to a study of limits
of sequences of elements in Galois groups, and more general types of
groups. In the next chapter we shall meet two basic examples: the
extensions of finite fields, and extensions generated by roots of unity.
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VII, §7. EXERCISES 

1. Let $\{G_n\}$ be a sequence of multiplicative groups and for each n suppose
given a surjective homomorphism

\[ h_{n+1}: G_{n+1} \to G_n. \]

Let G be the set of all sequences

\[ (s_1, s_2, s_3, \ldots) \quad \text{with } s_n \in G_n \]

satisfying the condition $h_n s_n = s_{n-1}$. Define multiplication of such
sequences componentwise. Prove that G is a group, which is called the
projective limit of $\{G_n\}$. If the groups $G_n$ are additive groups, then G
is an additive group in a similar way.

Examples. Let p be a prime number. Let $\mathbf{Z}(p^n) = \mathbf{Z}/p^n\mathbf{Z}$, and let

\[ h_{n+1}: \mathbf{Z}/p^{n+1}\mathbf{Z} \to \mathbf{Z}/p^n\mathbf{Z} \]

be the natural homomorphism. The projective limit is called the group
of p-adic integers, and is denoted by $\mathbf{Z}_p$.

2. (a) Using the fact that $h_{n+1}$ is actually a surjective ring homomorphism,
show that $\mathbf{Z}_p$ is a ring in a natural way.

(b) Prove that $\mathbf{Z}_p$ has a unique prime ideal which is $p\mathbf{Z}_p$.

Let again p be a prime number. Instead of $\mathbf{Z}(p^n)$ as above, consider
the group of units in the ring $\mathbf{Z}/p^n\mathbf{Z}$, so let
$G_n = (\mathbf{Z}/p^n\mathbf{Z})^*$. We can define

\[ h^*_{n+1}: (\mathbf{Z}/p^{n+1}\mathbf{Z})^* \to (\mathbf{Z}/p^n\mathbf{Z})^* \]

to be the restriction of $h_{n+1}$, and you will see immediately that
$h^*_{n+1}$ is surjective. Cf. the exercises of Chapter II, §7. The projective
limit of $\{(\mathbf{Z}/p^n\mathbf{Z})^*\}$ is called the group of p-adic units,
and is denoted by $\mathbf{Z}_p^*$. Thus a p-adic unit consists of a sequence

\[ (u_1, u_2, u_3, \ldots), \]

where each $u_n \in (\mathbf{Z}/p^n\mathbf{Z})^*$ and

\[ u_{n+1} \equiv u_n \bmod p^n. \]

Each element $u_n$ can be represented by a positive integer prime to p, and
is well-defined mod $p^n$.

In Chapter VIII, §5 you will find an application of the projective limit of
the groups $(\mathbf{Z}/p^n\mathbf{Z})^*$ to the Galois theory of roots of unity. If you read
Theorem 5.1 of Chapter VIII, you can do right away Exercise 12 of VIII, §5 
following the above discussion. However, you can now do the following 
exercise, which depends only on the notions and results which have already 
been dealt with. 




3. Let F be a field and let $\{K_n\}$ be a sequence of finite Galois extensions
such that $K_n \subset K_{n+1}$. Let

\[ K = \bigcup_{n=1}^{\infty} K_n \quad \text{and} \quad G_n = \mathrm{Gal}(K_n/F). \]

By finite Galois theory, the restriction homomorphism $G_{n+1} \to G_n$ is
surjective. Define a natural map

\[ \lim G_n \to \mathrm{Aut}(K/F), \]

and prove that your map is an isomorphism. The limit is the projective limit
defined in Exercise 1.


[Figure: diagram of the field K with the groups $\mathrm{Aut}(K/F)$ and $G_{n+1}$.]


The next exercises give another approach to the limits which we have just 
considered, in another context which will relate to the context of Chapter IX. 
You may wish to postpone these exercises until you read Chapter IX. 

4. Let G be a group. Let $\mathcal{F}$ be the family of all subgroups of finite
index. Let $\{x_n\}$ be a sequence in G. We define this sequence to be Cauchy
if given $H \in \mathcal{F}$ there exists $n_0$ such that for $m, n \geq n_0$ we
have $x_n x_m^{-1} \in H$. If $\{x_n\}$, $\{y_n\}$ are two sequences, define their
product to be the sequence $\{x_n y_n\}$.

(a) Show that the set of Cauchy sequences forms a group.

(b) Define $\{x_n\}$ to be a null sequence if given $H \in \mathcal{F}$ there
exists $n_0$ such that for $n \geq n_0$ we have $x_n \in H$. Show that the null
sequences form a normal subgroup.

The factor group of all Cauchy sequences modulo the null sequences is called
the completion of G. Note that we have not assumed G to be commutative.

(c) Prove that the map which sends an element $x \in G$ on the class of the
sequence (x, x, x, . . .) modulo null sequences, is a homomorphism of G into
the completion, whose kernel is the intersection

\[ \bigcap_{H \in \mathcal{F}} H. \]


It may be useful to you to refer to the exercises of Chapter II, §4, where 
you should have proved that a subgroup H of finite index always contains 
a normal subgroup of finite index. 

5. Instead of the family of all subgroups of finite index, let p be a prime
number, and let $\mathcal{F}$ be the family of all normal subgroups whose index
is a power of p. Again define Cauchy sequences and null sequences and prove the




analogous statements of (a), (b), (c) in Exercise 4. This time, the completion is 
called the p-adic completion. 

6. Let $G = \mathbf{Z}$ be the additive group of integers, and let $\mathcal{F}$
be the family of subgroups $p^n\mathbf{Z}$ where p is a prime number. Let $R_p$
be the completion of $\mathbf{Z}$ in the sense of Exercise 5.

(a) If $\{x_n\}$ and $\{y_n\}$ are Cauchy sequences, show that $\{x_n y_n\}$ is a
Cauchy sequence, and prove that $R_p$ is a ring.

(b) Prove that the map which to each $x \in \mathbf{Z}$ associates the class of
the sequence (x, x, x, ...) modulo null sequences gives an embedding of
$\mathbf{Z}$ into $R_p$.

(c) Prove that $R_p$ has a unique prime ideal, which is $pR_p$.

(d) Prove that every ideal of $R_p$ has the form $p^m R_p$ for some integer m.

7. Let $\{x_n\}$ be a sequence in $R_p$ and let $x \in R_p$. Define
$x = \lim x_n$ if given a positive integer r, there exists $n_0$ such that for
all $n \geq n_0$ we have $x - x_n \in p^r R_p$. Show that every element
$\alpha \in R_p$ has a unique expression

\[ \alpha = \sum_{i=0}^{\infty} m_i p^i \quad \text{with } 0 \leq m_i \leq p - 1. \]

The infinite sum is by definition the limit of the partial sums

\[ \lim_{N \to \infty} (m_0 + m_1 p + \cdots + m_N p^N). \]

8. Let $G = \mathbf{Z}$ be the additive group of integers, and let $\mathcal{F}$
be the family of subgroups $p^n\mathbf{Z}$ where p is a prime number. Let $R_p$
be the completion of $\mathbf{Z}$ in the sense of Exercises 5, 6, and let
$\mathbf{Z}_p$ be the projective limit of $\mathbf{Z}/p^n\mathbf{Z}$ in the sense
of Exercise 1. Prove that there is an isomorphism $\mathbf{Z}_p \to R_p$. [In
practice, one does not distinguish between $\mathbf{Z}_p$ and $R_p$, and one
uses $\mathbf{Z}_p$ to denote the completion.]
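
The compatible sequences of Exercise 1 and the digit expansion of Exercise 7 can be explored numerically. The following Python sketch is not part of the original text, and the helper names (`truncations`, `is_compatible`, `digits`, `inverse_of_unit`) are hypothetical. It only handles finitely many components of a p-adic integer, so it illustrates the definitions rather than proving anything.

```python
def truncations(x, p, depth):
    """Components (x mod p, x mod p^2, ..., x mod p^depth) of the integer x."""
    return [x % p**n for n in range(1, depth + 1)]


def is_compatible(seq, p):
    """Projective-limit condition: the (n+1)-th component reduces to the n-th mod p^n."""
    return all(seq[n] % p**n == seq[n - 1] for n in range(1, len(seq)))


def digits(seq, p):
    """Digits m_0, m_1, ... (each between 0 and p-1) read off the last component."""
    last = seq[-1]
    return [(last // p**i) % p for i in range(len(seq))]


def inverse_of_unit(u, p, depth):
    """Components of 1/u for an integer u prime to p (needs Python 3.8+)."""
    return [pow(u, -1, p**n) for n in range(1, depth + 1)]


minus_one = truncations(-1, 5, 4)
half = inverse_of_unit(2, 5, 3)
print(minus_one, digits(minus_one, 5))
print(half, is_compatible(half, 5))
```

In this sketch the element -1 has every digit equal to p - 1, and the components of 1/2 form a compatible sequence even though 1/2 is not an ordinary integer.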


CHAPTER VIII 


Finite Fields 


It is worth while to consider separately finite fields, which exhibit some 
very interesting features. We preferred to do Galois theory in character- 
istic 0 first, in order not to obscure the basic ideas by the special phe- 
nomena which can occur when finite fields are involved. On the other 
hand, finite fields occur so frequently that we now deal with them more 
systematically. 


VIII, §1. GENERAL STRUCTURE 


Let F be a finite field with q elements. Let e be the unit element of F.
As with any ring, there is a unique ring homomorphism

\[ \mathbf{Z} \to F \]

such that

\[ n \mapsto ne = \underbrace{e + e + \cdots + e}_{n \text{ times}}. \]

The kernel is an ideal of $\mathbf{Z}$, which we know is principal, and cannot
be 0 since the image is finite. Let p be the smallest positive integer in
that ideal, so p generates the ideal. Then we have an isomorphism

\[ \mathbf{Z}/p\mathbf{Z} \to F, \]

between $\mathbf{Z}/p\mathbf{Z}$ and its image in F. Denote this image by
$\mathbf{F}_p$. Since $\mathbf{F}_p$ is a subfield of F, it has no divisors of 0,
and consequently the ideal $p\mathbf{Z}$ is prime. Hence p is a prime number,
uniquely determined by the field F.
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We call p the characteristic of F, and the subfield F p is called the prime 
field. 

As an exercise, prove that $\mathbf{Z}/p\mathbf{Z}$ has no automorphisms other
than the identity. We then identify $\mathbf{Z}/p\mathbf{Z}$ with its image
$\mathbf{F}_p$. This is possible in only one way. We write 1 instead of e.

Theorem 1.1. The number of elements of F is equal to a power of p. 

Proof. We may view F as a vector space over $\mathbf{F}_p$. Since F has only a
finite number of elements, it follows that the dimension of F over
$\mathbf{F}_p$ is finite. Let this dimension be n. If $\{w_1, w_2, \ldots, w_n\}$
is a basis, then every element of F has a unique expression

\[ x = a_1 w_1 + \cdots + a_n w_n, \]

with elements $a_i$ in $\mathbf{F}_p$. Since the choice of these $a_i$ is
arbitrary, it follows that there are $p^n$ possible elements in F, thus
proving that $q = p^n$, as desired.

The multiplicative group $F^*$ of non-zero elements has $q - 1$ elements,
and they all satisfy the equation

\[ x^{q-1} - 1 = 0 \quad \text{if } x \in F^*. \]

Hence all elements of F satisfy the equation

\[ x^q - x = 0. \]

(Of course, the only other element is 0.) 
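
As a concrete check of these two equations, here is a minimal Python sketch (an illustration added here, not from the original text). It assumes the model of a field with 9 elements as pairs (a, b) standing for a + bi with i^2 = -1 over Z/3Z; this works because -1 is not a square mod 3. The names `mul` and `power` are hypothetical helpers.

```python
from itertools import product

P = 3  # characteristic; t^2 + 1 is irreducible over Z/3Z since -1 is not a square mod 3


def mul(x, y):
    """Multiply a + b*i and c + d*i, using i^2 = -1 and reducing mod P."""
    a, b = x
    c, d = y
    return ((a * c - b * d) % P, (a * d + b * c) % P)


def power(x, e):
    """Raise x to the e-th power by repeated multiplication."""
    result = (1, 0)
    for _ in range(e):
        result = mul(result, x)
    return result


elements = list(product(range(P), repeat=2))
q = len(elements)
nonzero = [x for x in elements if x != (0, 0)]

# Every non-zero element satisfies x^(q-1) = 1, and every element satisfies x^q = x.
units_ok = all(power(x, q - 1) == (1, 0) for x in nonzero)
all_ok = all(power(x, q) == x for x in elements)
print(q, units_ok, all_ok)
```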

In Chapter IV, we discussed polynomials over arbitrary fields. Let us
consider the polynomial

\[ f(t) = t^q - t \]

over the finite field $\mathbf{F}_p$. It has q distinct roots in the field F,
namely all elements of F. The proof that a polynomial of degree n has at
most n roots applies as well to the present case. Hence if K is another
finite field containing F, then $t^q - t$ cannot have any roots in K other
than the elements of F.

If we use the definition of a splitting field as in the previous chapter, 
we then find: 

Theorem 1.2. The finite field F with q elements is the splitting field of
the polynomial $t^q - t$ over the field $\mathbf{Z}/p\mathbf{Z}$.
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By Theorem 3.2 of Chapter VII, two finite fields with the same 
number of elements are isomorphic. We denote the field with q elements 
by $\mathbf{F}_q$. In particular, consider the polynomial

\[ t^p - t. \]

It has p roots, namely the elements of the prime field. Therefore the
elements of $\mathbf{F}_p$ are precisely the roots of $t^p - t$ in $\mathbf{F}_q$.

In the previous chapter, we had used an algebraically closed field right
from the start as a matter of convenience. Let us do the same thing
here, and postpone to a last section the discussion of the existence of
such a field containing our field F. Thus:

We assume that all of our fields are contained in an algebraically closed
field A, containing $\mathbf{F}_p = \mathbf{Z}/p\mathbf{Z}$.

In Theorem 1.2 we started with a finite field F with q elements. We
may ask for the converse: Given $q = p^n$ equal to a power of p, what is
the nature of the splitting field of $t^q - t$ over the field
$\mathbf{F}_p = \mathbf{Z}/p\mathbf{Z}$?

Theorem 1.3. Given $q = p^n$ the set of elements $x \in A$ such that $x^q = x$
is a finite field with q elements.

Proof. We first make some remarks on binomial coefficients. In the
ordinary binomial expansion

\[ (x + y)^p = \sum_{i=0}^{p} \binom{p}{i} x^i y^{p-i} \]

we see from the expression

\[ \binom{p}{i} = \frac{p!}{i!\,(p-i)!} \]

that all binomial coefficients are divisible by p except the first and the
last. Hence in the finite field, all the coefficients are 0 except the first
and the last, and we obtain the basic formula:

For any elements $x, y \in A$, we have

\[ (x + y)^p = x^p + y^p. \]

By induction, we then see that for any positive integer m, we have

\[ (x + y)^{p^m} = x^{p^m} + y^{p^m}. \]

Let K be the set of elements $x \in A$ such that

\[ x^{p^n} = x. \]
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It is then easily verified that K is a field. Indeed, the above formula 
shows that this set is closed under addition. It is closed under
multiplication, since

\[ (xy)^{p^n} = x^{p^n} y^{p^n} \]

in any commutative ring. If $x \neq 0$ and $x^{p^n} = x$, then we see at once
that

\[ (x^{-1})^{p^n} = x^{-1}. \]

Let $f(t) = t^q - t$. Then K contains all the roots of f, and is then
obviously the smallest field containing $\mathbf{F}_p$ and all roots of f.
Consequently, K is the splitting field of f.

The theory of unique factorization in Chapter IV, §3 applies to
polynomials over any field, and in particular, the derivative criterion of
Theorem 3.6 applies. Observe here how peculiar the derivative is. We
have

\[ f'(t) = q t^{q-1} - 1 = -1, \]

because $q = 0$ in $\mathbf{F}_p = \mathbf{Z}/p\mathbf{Z}$. Consequently the
polynomial f(t) has no multiple roots. Since it has at most q roots in A,
we conclude that it has exactly q roots in A. Hence K has exactly q
elements. This concludes the proof of Theorem 1.3.
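
The divisibility of the binomial coefficients used at the start of this proof is easy to test numerically. The following Python sketch (an added illustration; the function names are hypothetical) checks it for a few primes, and also checks the resulting formula $(x + y)^p = x^p + y^p$ for integers taken mod p.

```python
from math import comb


def middle_binomials_divisible(p):
    """True if p divides every binomial coefficient C(p, i) with 0 < i < p."""
    return all(comb(p, i) % p == 0 for i in range(1, p))


def freshman_formula_holds(p):
    """True if (x + y)^p and x^p + y^p agree mod p for all residues x, y."""
    return all(
        pow(x + y, p, p) == (pow(x, p, p) + pow(y, p, p)) % p
        for x in range(p)
        for y in range(p)
    )


for p in (2, 3, 5, 7, 11, 13):
    print(p, middle_binomials_divisible(p), freshman_formula_holds(p))

# The primality of p matters: C(4, 2) = 6 is not divisible by 4.
print(4, middle_binomials_divisible(4))
```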


VIII, §1. EXERCISES 

1. Let F be a finite field with q elements. Let $f(t) \in F[t]$ be irreducible.

(a) Prove that f(t) divides $t^{q^n} - t$ if and only if deg f divides n.

(b) Show that

\[ t^{q^n} - t = \prod_{d \mid n} \prod_{f_d \text{ irr}} f_d(t), \]

where the product on the inside is over all irreducible polynomials of
degree d with leading coefficient 1.

(c) Let $\psi(d)$ be the number of irreducible polynomials over F of degree d.
Show that

\[ q^n = \sum_{d \mid n} d\,\psi(d). \]

(d) Let $\mu$ be the Moebius function. Prove that

\[ n\,\psi(n) = \sum_{d \mid n} \mu(d)\, q^{n/d}. \]

Dividing by n yields an explicit formula for the number of irreducible
polynomials of degree n, and leading coefficient 1 over F.
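
The counting formula in part (d) can be evaluated directly. Below is a small Python sketch (added for illustration, with hypothetical function names `mobius` and `count_irreducible`); it takes the formula as given and does not replace the proof asked for in the exercise. It also checks the identity of part (c) in one case.

```python
def mobius(n):
    """Moebius function: 0 if n has a squared prime factor, else (-1)^(number of primes)."""
    result = 1
    d = 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0
            result = -result
        d += 1
    if n > 1:
        result = -result
    return result


def count_irreducible(q, n):
    """Number of monic irreducible polynomials of degree n over a field with q elements."""
    total = sum(mobius(d) * q ** (n // d) for d in range(1, n + 1) if n % d == 0)
    return total // n


print([count_irreducible(2, n) for n in range(1, 6)])

# Part (c) for q = 2, n = 4: the sum of d * psi(d) over d | 4 equals 2^4.
print(sum(d * count_irreducible(2, d) for d in (1, 2, 4)))
```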
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VIII, §2. THE FROBENIUS AUTOMORPHISM 

Theorem 2.1. Let F be a finite field with q elements. The mapping

\[ \varphi: x \mapsto x^p \]

is an automorphism of F.

Proof. We already know that the map

\[ x \mapsto x^p \]

is a ring-homomorphism of F into itself. Its kernel is 0, and since F is
finite, the map has to be a bijection, so an automorphism.

The automorphism $\varphi$ is called the Frobenius automorphism of F
(relative to the prime field). It generates a cyclic group, which is finite,
because F has only a finite number of elements. Let $q = p^n$. Note that

\[ \varphi^n = \mathrm{id}. \]

Indeed, for any positive integer m,

\[ \varphi^m x = x^{p^m} \]

for all x in F. Hence the period of $\varphi$ divides n, because

\[ \varphi^n x = x^{p^n} = x^q = x. \]

Theorem 2.2. The period of $\varphi$ is exactly n.

Proof. Suppose the period is $m < n$. Then every element x of F satisfies
the equation

\[ x^{p^m} - x = 0. \]

But we have remarked in the preceding section that the polynomial

\[ t^{p^m} - t \]

has at most $p^m$ roots. Hence we cannot have $m < n$, as desired.

Theorem 2.3. Suppose $\mathbf{F}_{p^m}$ is a subfield of $\mathbf{F}_{p^n}$. Then
$m \mid n$. Conversely, if $m \mid n$ then $\mathbf{F}_{p^m}$ is a subfield of
$\mathbf{F}_{p^n}$.
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Proof. Let F be a subfield of F q , where q = p n . Then F contains the 
prime field $\mathbf{F}_p$, so F has $p^m$ elements for some m. We view
$\mathbf{F}_q$ as a vector space over F, say of dimension d. Then after
representing elements of $\mathbf{F}_q$ as linear combinations of basis
elements with coefficients in F we see that $\mathbf{F}_q$ has
$(p^m)^d = p^{md}$ elements, whence $n = md$ and $m \mid n$.

Conversely, let $m \mid n$, $n = md$. Let F be the fixed field of
$\varphi^m$. Then F is the set of all elements $x \in \mathbf{F}_q$ such that

\[ x^{p^m} = x. \]

By Theorem 1.2, this is precisely the field $\mathbf{F}_{p^m}$. But

\[ (\varphi^m)^d = \varphi^n, \]

so $\mathbf{F}_{p^m}$ is fixed under $\varphi^n$. Since $\mathbf{F}_{p^n}$ is the
fixed field of $\varphi^n$, it follows that

\[ \mathbf{F}_{p^m} \subset \mathbf{F}_{p^n}. \]

This concludes the proof of the theorem.

If $n = md$, we see that the order of $\varphi^m$ is precisely equal to d.
Thus $\varphi^m$ generates a cyclic group of automorphisms of $\mathbf{F}_q$,
whose fixed field is $\mathbf{F}_{p^m}$. The order of this cyclic group is
exactly equal to the degree

\[ d = [\mathbf{F}_{p^n} : \mathbf{F}_{p^m}]. \]

In the next section, we shall prove that the multiplicative group of
$\mathbf{F}_q$ is cyclic. Using this, we now prove:

Theorem 2.4. The only automorphisms of $\mathbf{F}_q$ are the powers of the
Frobenius automorphism $1, \varphi, \ldots, \varphi^{n-1}$.

Proof. Let $\mathbf{F}_q = \mathbf{F}_p(\alpha)$ for some element $\alpha$ (this
is what we are assuming now). Then $\alpha$ is a root of a polynomial of
degree n, if $q = p^n$. Therefore by Theorem 2.1 of Chapter VII there are at
most n automorphisms of $\mathbf{F}_q$ over the prime field $\mathbf{F}_p$. Since
$1, \varphi, \ldots, \varphi^{n-1}$ constitute n distinct automorphisms, there
cannot be any others. This concludes the proof.


The above theorems carry out the Galois theory for finite fields. We
have obtained a bijection between subfields of $\mathbf{F}_q$ and subgroups of
the group of automorphisms of $\mathbf{F}_q$, each subfield being the fixed
field of a subgroup.
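
This correspondence can be seen concretely in the field with 16 elements. The following Python sketch is an added illustration under a stated assumption: the field is modelled as polynomials over Z/2Z modulo $t^4 + t + 1$ (a standard irreducible polynomial of degree 4), with each element stored as a 4-bit integer. The names `mul`, `frobenius` and `fixed_field` are hypothetical helpers.

```python
MODULUS = 0b10011  # t^4 + t + 1, assumed model: irreducible over Z/2Z
N = 4              # degree of the extension, so q = 2^4 = 16


def mul(x, y):
    """Multiply two elements (bit i is the coefficient of t^i), reducing mod MODULUS."""
    result = 0
    while y:
        if y & 1:
            result ^= x          # addition in characteristic 2 is XOR
        y >>= 1
        x <<= 1
        if x & (1 << N):
            x ^= MODULUS         # replace t^4 by t + 1
    return result


def frobenius(x, times=1):
    """Apply the map x -> x^2 the given number of times."""
    for _ in range(times):
        x = mul(x, x)
    return x


def fixed_field(m):
    """Elements fixed by the m-th power of the Frobenius map."""
    return [x for x in range(1 << N) if frobenius(x, m) == x]


period = next(m for m in range(1, N + 1) if len(fixed_field(m)) == 1 << N)
print(period)
print(fixed_field(1), fixed_field(2), len(fixed_field(4)))
```

In this model the fixed sets for m = 1, 2, 4 have 2, 4 and 16 elements, matching the subfields for the divisors of n = 4.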
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VIII, §3. THE PRIMITIVE ELEMENTS 

Let F be a finite field with $q = p^n$ elements. In this section, we prove
more than the fact that $F = \mathbf{F}_p(\alpha)$ for some $\alpha$.

Theorem 3.1. The multiplicative group $F^*$ of F is cyclic.

Proof. In Theorem 1.10 of Chapter IV we gave a proof based on the
structure theorem for abelian groups. Here we give a proof based on a
similar idea but self contained. We start with a remark. Let A be a finite
abelian group, written additively. Let a be an element of A, of period d,
and let b be an element of period $d'$, with $(d, d') = 1$. Then

\[ a + b \]

has period $dd'$. The proof is easy, and is left as an exercise.

Lemma 3.2. Let z be an element of A having maximal period, that is,
whose period d is $\geq$ the period of any other element of A. Then the
period of any element of A divides d.

Proof. Suppose w has period m not dividing d. Let $l$ be a prime
number such that a power $l^k$ divides the period of w, but does not divide
d. Write

\[ m = l^k m', \qquad d = l^{\nu} d', \]

where $m'$, $d'$ are not divisible by $l$. Let

\[ a = m'w \quad \text{and} \quad b = l^{\nu} z. \]

Then a has period $l^k$ and b has period $d'$. Then

\[ a + b \]

has period $l^k d' > d$, a contradiction.

We apply these remarks to the multiplicative group $F^*$, having $q - 1$
elements. Let $\alpha$ be an element of $F^*$ having maximal period d. Then
$\alpha$ is a root of the polynomial

\[ t^d - 1. \]

The powers

\[ 1, \alpha, \ldots, \alpha^{d-1} \]
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are distinct, and are roots of this polynomial. Hence they constitute all 
the distinct roots of the polynomial $t^d - 1$. Suppose that $\alpha$ does not
generate $F^*$, so there is another element x in $F^*$ which is not a power of
$\alpha$. By the lemma, this element x is also a root of $t^d - 1$. This
implies that $t^d - 1$ has more than d roots, a contradiction which proves
the theorem.

Example. Consider the prime field $\mathbf{F}_p = \mathbf{Z}/p\mathbf{Z}$ itself.
The theorem asserts the existence of an element $a \in \mathbf{F}_p^*$ such that
every element of $\mathbf{F}_p^*$ is an integral power of a. In terms of
congruences, this means that there exists an integer a such that every
integer x prime to p satisfies a relation

\[ x \equiv a^{\nu} \bmod p, \]

for some positive integer $\nu$. Such an integer a is called a primitive root
mod p in the classical literature.
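
A primitive root can be found by brute force for small primes. The Python sketch below is an added illustration, with hypothetical function names; it simply computes the period of each residue until one of period p - 1 appears, and makes no attempt at efficiency.

```python
def multiplicative_order(a, p):
    """Period of a in the multiplicative group of Z/pZ (a must be prime to p)."""
    order, power = 1, a % p
    while power != 1:
        power = (power * a) % p
        order += 1
    return order


def smallest_primitive_root(p):
    """Smallest positive integer whose powers give every non-zero residue mod p."""
    for a in range(1, p):
        if multiplicative_order(a, p) == p - 1:
            return a


a = smallest_primitive_root(23)
powers = sorted(pow(a, v, 23) for v in range(1, 23))
print(a, powers == list(range(1, 23)))
print(multiplicative_order(2, 23), multiplicative_order(2, 17))
```

For p = 23 the residue 2 only has period 11, so it is not a primitive root, while 5 is.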


VIII, §3. EXERCISES 

1. Find the smallest positive integer which is a primitive root mod p in each 
case: p = 3, p = 5, p = 7, p = 11, p = 13. 

2. Make a list of all the primes $\leq 100$ for which 2 is a primitive root. Do
you think there are infinitely many such primes? The answer (yes) was
conjectured by Artin, together with a density; cf. his collected works.

3. If $\alpha$ is a cyclic generator of $F^*$, where F is a finite field, show
that $\alpha^p$ is also a cyclic generator.

4. Let p be a prime $\geq 3$. An integer $a \not\equiv 0 \bmod p$ is called a
quadratic residue mod p if there exists an integer x such that

\[ a \equiv x^2 \bmod p. \]

It is called a quadratic non-residue if there is no such integer x. Show that
the number of quadratic residues is equal to the number of quadratic
non-residues. [Hint: Consider the map $x \mapsto x^2$ on $\mathbf{F}_p^*$.]


VIII, §4. SPLITTING FIELD AND ALGEBRAIC CLOSURE 

In Chapter VII, §7 we gave a general method for constructing an
algebraic closure. Here in the case of finite fields, we can express the
proof more simply. Let g be a polynomial of degree $\geq 1$ over the finite
field F. We have shown in Chapter VII, Theorem 3.1 how to construct a
splitting field for g.

For each positive integer n, let $K_n$ be the splitting field of the
polynomial

\[ t^{p^n} - t \]

over the prime field $\mathbf{F}_p$. By Theorem 3.2 of Chapter VII, given two
splitting fields E and E' of a polynomial f there is an isomorphism

\[ \sigma: E' \to E \]

leaving F fixed. In particular, if $m \mid n$, there is an embedding

\[ K_m \to K_n \]

because any root of $t^{p^m} - t$ is also a root of $t^{p^n} - t$. If we then
consider the union

\[ A = \bigcup K_n \]

for $n = 1, 2, \ldots$, then this union is easily shown to be algebraically
closed. Indeed, let f be a polynomial of degree $\geq 1$ with coefficients in
A. Then f has coefficients in some $K_m$, so f splits into factors of degree
1 in $K_n$ for some n. This concludes the proof.


VIII, §5. IRREDUCIBILITY OF THE CYCLOTOMIC 
POLYNOMIALS OVER Q 

In Chapter VII, §6 we considered an extension $F(\zeta)$, where $\zeta$ is a
primitive n-th root of unity. When $F = \mathbf{Q}$ is the rational numbers,
the Galois group can be determined completely, and we shall now do so by
using finite fields, although we use essentially nothing of what precedes
in this chapter, only the flavor.

Let $\sigma$ be an automorphism of $F(\zeta)$ over F. Then there is some
integer $r(\sigma)$ prime to n such that

\[ \sigma\zeta = \zeta^{r(\sigma)}, \]

and this integer mod n is uniquely determined by $\sigma$. Thus we get a map

\[ G_{F(\zeta)/F} \to (\mathbf{Z}/n\mathbf{Z})^* \quad \text{by} \quad \sigma \mapsto r(\sigma). \]

This map is injective. It is a homomorphism, because

\[ \sigma(\tau(\zeta)) = \sigma(\zeta^{r(\tau)}) = (\sigma\zeta)^{r(\tau)} = \zeta^{r(\sigma)r(\tau)}, \]

whence $r(\sigma\tau) = r(\sigma)r(\tau)$. In this way, we can view
$G_{F(\zeta)/F}$ as embedded in the multiplicative group
$(\mathbf{Z}/n\mathbf{Z})^*$.

So far, we have not used any special property of the rational numbers 
or the integers. We shall now do so. 

Let $f(t) \in \mathbf{Z}[t]$ be a polynomial with integer coefficients. Let p be
a prime number. Then we can reduce f mod p. If

\[ f(t) = a_n t^n + \cdots + a_0 \]

with $a_0, \ldots, a_n \in \mathbf{Z}$, then we let its reduction mod p be

\[ \bar{f}(t) = \bar{a}_n t^n + \cdots + \bar{a}_0, \]

where $\bar{a}_i$ is the reduction mod p of $a_i$. The map $f \mapsto \bar{f}$ is
a homomorphism

\[ \mathbf{Z}[t] \to (\mathbf{Z}/p\mathbf{Z})[t] = \mathbf{F}_p[t]. \]

This is easily verified by using the definition of addition and
multiplication of polynomials as in Chapter IV, §5.

Theorem 5.1. The map $\sigma \mapsto r(\sigma)$ is an isomorphism

\[ G_{\mathbf{Q}(\zeta)/\mathbf{Q}} \to (\mathbf{Z}/n\mathbf{Z})^*. \]

Proof. Let m be a positive integer prime to n. Then the map

\[ \zeta \mapsto \zeta^m \]

can be decomposed as a composite of maps

\[ \zeta \mapsto \zeta^p, \]

where p ranges over primes dividing m. Thus it will suffice to prove: if p
is a prime number and $p \nmid n$, and if f(t) is the irreducible polynomial
of $\zeta$ over $\mathbf{Q}$, then $\zeta^p$ is also a root of f(t). Since the
roots of $t^n - 1$ are all the n-th roots of unity, primitive or not, it
follows that f(t) divides $t^n - 1$, so there is a polynomial
$h(t) \in \mathbf{Q}[t]$ with leading coefficient 1 such that

\[ t^n - 1 = f(t)h(t). \]

By the Gauss lemma, it follows that f, h have integral coefficients.

Suppose $\zeta^p$ is not a root of f. Then $\zeta^p$ is a root of h, and
$\zeta$ itself is a root of $h(t^p)$. Hence f(t) divides $h(t^p)$, and we can
write

\[ h(t^p) = f(t)g(t). \]
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Since f(t ) has integral coefficients and leading coefficient 1, we see 
that g has integral coefficients, again by the Gauss lemma. Since
$a^p \equiv a \pmod{p}$ for any integer a, we conclude that

\[ h(t^p) \equiv h(t)^p \pmod{p}, \]

and hence

\[ h(t)^p \equiv f(t)g(t) \pmod{p}. \]

In particular, if we denote by $\bar{f}$ and $\bar{h}$ the polynomials over
$\mathbf{Z}/p\mathbf{Z}$ obtained by reducing f and h respectively mod p, we see
that $\bar{f}$ and $\bar{h}$ are not relatively prime, i.e. have a factor in
common. But $t^n - 1 = \bar{f}(t)\bar{h}(t)$ over $\mathbf{Z}/p\mathbf{Z}$, and
hence $t^n - 1$ has multiple roots. This is impossible, as one sees by
taking the derivative, and our theorem is proved.

As a consequence of Theorem 5.1, it follows that the cyclotomic 
polynomials of Chapter IV, §3, Exercise 13 are irreducible over Q. Prove 
this as an exercise. 
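
The cyclotomic polynomials can be computed by dividing $t^n - 1$ by the cyclotomic polynomials of the proper divisors of n. The Python sketch below is an added illustration with hypothetical function names; polynomials are stored as integer coefficient lists, lowest degree first. It checks that the degree equals the order of $(\mathbf{Z}/n\mathbf{Z})^*$, which is consistent with Theorem 5.1 but is of course not a proof of irreducibility.

```python
from functools import lru_cache
from math import gcd


def poly_divide_exact(num, den):
    """Quotient of integer polynomials; den must be monic and divide num exactly."""
    num = list(num)
    shift = len(den) - 1
    quotient = [0] * (len(num) - shift)
    for i in range(len(quotient) - 1, -1, -1):
        coeff = num[i + shift]
        quotient[i] = coeff
        for j, d in enumerate(den):
            num[i + j] -= coeff * d
    assert not any(num), "division was not exact"
    return quotient


@lru_cache(maxsize=None)
def cyclotomic(n):
    """Coefficients of the n-th cyclotomic polynomial, constant term first."""
    poly = [-1] + [0] * (n - 1) + [1]  # t^n - 1
    for d in range(1, n):
        if n % d == 0:
            poly = poly_divide_exact(poly, cyclotomic(d))
    return tuple(poly)


def euler_phi(n):
    """Order of the group of units of Z/nZ."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


print(cyclotomic(12))
print(all(len(cyclotomic(n)) - 1 == euler_phi(n) for n in range(1, 40)))
```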


VIII, §5. EXERCISES 

1. Let F be a finite extension of the rationals. Show that there is only a finite 
number of roots of unity in F. 

2. (a) Determine which roots of unity lie in the following fields:
$\mathbf{Q}(i)$, $\mathbf{Q}(\sqrt{-2})$, $\mathbf{Q}(\sqrt{2})$,
$\mathbf{Q}(\sqrt{-3})$, $\mathbf{Q}(\sqrt{3})$, $\mathbf{Q}(\sqrt{-5})$.

(b) Let $\zeta$ be a primitive n-th root of unity. For which n is

\[ [\mathbf{Q}(\zeta) : \mathbf{Q}] = 2? \]

Prove your assertion, of course.

3. Let F be a field of characteristic 0, and n an odd integer $\geq 1$. Let
$\zeta$ be a primitive n-th root of unity in F. Show that F also contains a
primitive 2n-th root of unity.

4. Given a prime number p and a positive integer m, show that there exists a
cyclic extension of $\mathbf{Q}$ of degree $p^m$. [Hint: Use the exercises of
Chapter II, §7.]

5. Let n be a positive integer. Prove that there exist infinitely many cyclic
extensions of $\mathbf{Q}$ of degree n which are independent. That is, if
$K_1, \ldots, K_r$ are such extensions then

\[ K_i \cap (K_1 K_2 \cdots K_{i-1}) = \mathbf{Q} \quad \text{for all } i = 2, \ldots, r. \]

(For this exercise, you may assume that there exist infinitely many primes p
such that $p \equiv 1 \bmod n$. Again, use Chapter II, §7 and its exercises.)
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6. Let G be a finite abelian group. Prove that there exists a Galois extension of 
Q whose Galois group is G. In fact, prove that there exist infinitely many 
such extensions. [You may use previous exercises.] 

7. Let $\alpha^3 = 2$ and $\beta^5 = 7$. Let $\gamma = \alpha + \beta$. Prove that
$\mathbf{Q}(\alpha, \beta) = \mathbf{Q}(\gamma)$ and that
$[\mathbf{Q}(\alpha, \beta) : \mathbf{Q}] = 15$.

In the next two exercises, you will see a non-abelian linear group appearing 
as a Galois group. 

8. Describe the splitting field of $t^5 - 7$ over the rationals. What is its
degree? Show that the Galois group is generated by two elements $\sigma$,
$\tau$ satisfying the relation

\[ \sigma^5 = 1, \quad \tau^4 = 1, \quad \tau\sigma\tau^{-1} = \sigma^2. \]


9. Let p be an odd prime and let a be a rational number which is not a p-th
power in $\mathbf{Q}$. Let K be the splitting field of $t^p - a$ over the
rationals.

(a) Prove that $[K : \mathbf{Q}] = p(p - 1)$. [Cf. Exercise 3 of §3.]

(b) Let $\alpha$ be a root of $t^p - a$. Let $\zeta$ be a primitive p-th root of
unity. Let $\sigma \in G_{K/\mathbf{Q}}$. Prove that there exists some integer
$b = b(\sigma)$, uniquely determined mod p, such that

\[ \sigma(\alpha) = \zeta^b \alpha. \]

(c) Show that there exists some integer $d = d(\sigma)$ prime to p, uniquely
determined mod p, such that

\[ \sigma(\zeta) = \zeta^d. \]

(d) Let G be the subgroup of $GL_2(\mathbf{Z}/p\mathbf{Z})$ consisting of all
matrices

\[ \begin{pmatrix} 1 & 0 \\ b & d \end{pmatrix} \quad \text{with } b \in \mathbf{Z}/p\mathbf{Z} \text{ and } d \in (\mathbf{Z}/p\mathbf{Z})^*. \]

Prove that the association

\[ \sigma \mapsto M(\sigma) = \begin{pmatrix} 1 & 0 \\ b(\sigma) & d(\sigma) \end{pmatrix} \]

is an isomorphism of $G_{K/\mathbf{Q}}$ with G.

(e) Let r be a primitive root mod p, i.e. a positive integer prime to p which
generates the cyclic group $(\mathbf{Z}/p\mathbf{Z})^*$. Show that there exist
elements $\rho, \tau \in G_{K/\mathbf{Q}}$ which generate $G_{K/\mathbf{Q}}$, and
satisfy the relations:

\[ \rho^p = 1, \quad \tau^{p-1} = 1, \quad \tau\rho\tau^{-1} = \rho^r. \]

(f) Let F be a subfield of K which is abelian over $\mathbf{Q}$. Prove that
$F \subset \mathbf{Q}(\zeta)$.

10. Let p, q be distinct odd primes. Let a, b be rational numbers such that a
is not a p-th power in $\mathbf{Q}$ and b is not a q-th power in $\mathbf{Q}$. Let
$f(t) = t^p - a$ and $g(t) = t^q - b$. Let $K_1$ be the splitting field of f(t)
and $K_2$ the splitting field of g(t). Prove that $K_1 \cap K_2 = \mathbf{Q}$.
It follows (from what?) that if K is the splitting field of f(t)g(t), then

\[ G_{K/\mathbf{Q}} \approx G_{K_1/\mathbf{Q}} \times G_{K_2/\mathbf{Q}}. \]

11. Generalize Exercise 10 as much as you can. 

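12. Refer back to the exercises of Chapter VII, §7. Let p be a prime number.
Let $\mu(p^n)$ denote the group of $p^n$-th roots of unity. Let

\[ \mu(p^{\infty}) = \bigcup_{n=1}^{\infty} \mu(p^n). \]

Let $K_n = \mathbf{Q}(\mu(p^n))$ be the splitting field of $t^{p^n} - 1$ over
the rationals, and let

\[ K_{\infty} = \bigcup_{n=1}^{\infty} K_n. \]

Prove:

Theorem. Let $\mathbf{Z}_p^*$ be the projective limit of the groups
$(\mathbf{Z}/p^n\mathbf{Z})^*$, as in Chapter VII, §7, Exercise 2. There is an
isomorphism

\[ \mathbf{Z}_p^* \to \mathrm{Aut}(K_{\infty}/\mathbf{Q}), \]

which can be obtained as follows.

(a) Given $a \in \mathbf{Z}_p^*$ prove that there exists an automorphism
$\sigma_a \in \mathrm{Aut}(K_{\infty}/\mathbf{Q})$ having the following property.
Let $\zeta \in \mu(p^{\infty})$. Choose n such that $\zeta \in \mu(p^n)$. Let
$u \in \mathbf{Z}$ be such that $u \equiv a \bmod p^n$. Then
$\sigma_a \zeta = \zeta^u$.

(b) Prove that the map $a \mapsto \sigma_a$ is an injective homomorphism of
$\mathbf{Z}_p^*$ into $\mathrm{Aut}(K_{\infty}/\mathbf{Q})$.

(c) Given $\sigma \in \mathrm{Aut}(K_{\infty}/\mathbf{Q})$, prove that there
exists $a \in \mathbf{Z}_p^*$ such that $\sigma = \sigma_a$.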


VIII, §6. WHERE DOES IT ALL GO? OR RATHER, 

WHERE DOES SOME OF IT GO? 

You have now learned some facts about Galois groups, and I thought 
you should get some idea of the type of questions which remain un- 
solved. If you find the going too hard, then sleep on it. If you still find 
it too hard, then you can obviously skip this entire section without 
affecting your understanding of any other part of the book. I want this 
section to be stimulating, not scary. 

One fundamental question is whether given a finite group G, there ex- 
ists a Galois extension K of Q whose Galois group is G. This problem 
has been realized explicitly for at least a century. Emmy Noether 
thought of one possibility: construct a Galois extension of an extension
$\mathbf{Q}(u_1, \ldots, u_n)$ with the given Galois group, and then specialize the
parameters $u_1, \ldots, u_n$ to rational numbers. Here $u_1, \ldots, u_n$ are
independent variables. You have seen how one can construct an extension
$\mathbf{Q}(x_1, \ldots, x_n)$ of $\mathbf{Q}(s_1, \ldots, s_n)$ where $s_1, \ldots, s_n$ are the
elementary symmetric functions of $x_1, \ldots, x_n$. The Galois group is the
symmetric group $S_n$. Let $G$ be a subgroup of $S_n$. It was an open question
for a long time whether the fixed field of $G$ can be written in the form
$\mathbf{Q}(u_1, \ldots, u_n)$. Swan showed that this was impossible in general, even
if $G$ was a cyclic group (Inventiones Math., 1969).

In the nineteenth century already, number theorists realized the differ- 
ence between abelian and non-abelian extensions. Kronecker stated and 
gave what are today considered as incomplete arguments that every finite 
abelian extension of $\mathbf{Q}$ is contained in some extension $\mathbf{Q}(\zeta)$ where $\zeta$ is a
root of unity. Such extensions are called cyclotomic. A complete proof
was given by Weber at the end of the nineteenth century. 

If F is a finite extension of Q, the situation is harder to describe, but 
I shall give one significant example exhibiting the flavor. The field F 
contains a subring $R_F$, called the ring of algebraic integers in $F$,
consisting of all elements $\alpha \in F$ such that the irreducible polynomial of
$\alpha$ with rational coefficients and leading coefficient 1 has in fact all its
coefficients contained in the integers $\mathbf{Z}$. It can be shown that the set of
all such elements is a ring $R_F$, and that $F$ is its quotient field.

Let $P$ be a prime ideal of $R_F$. It is easy to show that $P \cap \mathbf{Z} = (p)$ is
generated by a prime number $p$. Furthermore, $R_F/P$ is a finite field with
$q$ elements. Let $K$ be a finite Galois extension of $F$. It can be shown
that there exists a prime ideal $Q$ of $R_K$ such that $Q \cap R_F = P$.
Furthermore, there exists an element $\sigma_Q \in G = \mathrm{Gal}(K/F)$ such that
$\sigma_Q(Q) = Q$ and for all $\alpha \in R_K$ we have

$$\sigma_Q \alpha \equiv \alpha^{q} \bmod Q.$$

We call $\sigma_Q$ a Frobenius element in the Galois group $G$ associated with $Q$.
In fact, it can be shown that for all but a finite number of $Q$, two such
elements are conjugate to each other in $G$. We denote any of them by
$\sigma_P$. If $G$ is abelian, then there is only one element $\sigma_Q$ in the Galois
group.

Theorem. There exists a unique abelian extension $K$ of $F$ having the
following property: If $P_1$, $P_2$ are prime ideals of $R_F$, then
$\sigma_{P_1} = \sigma_{P_2}$ if and only if there is an element $\alpha$ of $K$ such that
$\alpha P_1 = P_2$.

In a similar but more complicated manner, one can characterize all 
abelian extensions of $F$. This theory is known as class field theory,
developed by Kronecker, Weber, Hilbert, Takagi, and Artin. The main
statement concerning the Frobenius automorphism is Artin’s Reciprocity Law.
You can find an account of class field theory in books on algebraic 
number theory. Although some fundamental results are known, by no 
means all results are known. 

The non-abelian case is much more difficult. I shall indicate briefly 
one special case which gives some of the flavor of what’s going on. The 
problem is to do for non-abelian extensions what Artin did for abelian 
extensions in “class field theory”. Artin went as far as saying that the 
problem was not to give proofs but to formulate what was to be proved. 
The insight of Langlands and others in the sixties showed that actually 
Artin was mistaken: The problem lies in both. Shimura made several 
computations in this direction involving “modular forms”, see, for in- 
stance, A reciprocity law in non-solvable extensions , J. reine angew. Math. 
221, 1966. Langlands gave a number of conjectures, relating Galois 
groups with “automorphic forms”, which showed that the answer lay in 
deeper theories, whose formulations, let alone their proofs, were difficult. 
Great progress was made in the seventies by Serre and Deligne, who
proved a first case of Langlands’ conjectures, Annales École Normale
Supérieure, 1974.

The study of non-abelian Galois groups occurs via their linear
“representations”. For instance let $l$ be a prime number. We can ask whether
$GL_n(\mathbf{F}_l)$, or $GL_2(\mathbf{F}_l)$, or $PGL_2(\mathbf{F}_l)$ occurs as a Galois group over $\mathbf{Q}$, and
“how”. The problem is to find natural objects on which a Galois group 
operates as a linear map, such that we get in a natural way an isomor- 
phism of this Galois group with one of the above linear groups. The 
theories which indicate in which direction to find such objects are much 
beyond the level of this course. Again I pick a special case to give the 
flavor. 

Let $K$ be a finite Galois extension of the rational numbers, with
Galois group $G = \mathrm{Gal}(K/\mathbf{Q})$. Let

$$\rho : G \to GL_2(\mathbf{F}_l)$$

be a homomorphism of $G$ into the group of $2 \times 2$ matrices over the finite
field $\mathbf{F}_l$ for some prime $l$. Such a homomorphism is called a
representation of $G$. Recall from elementary linear algebra that if

$$M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

is a $2 \times 2$ matrix, then its trace is defined to be the sum of the diagonal
elements, that is

$$\mathrm{tr}\, M = a + d.$$

Thus we can take the trace and determinant

$$\mathrm{tr}\, \rho(\sigma) \quad \text{and} \quad \det \rho(\sigma),$$

which are elements of the field $\mathbf{F}_l$ itself.

Consider the infinite product

$$\Delta = \Delta(z) = z \prod_{n=1}^{\infty} (1 - z^{n})^{24} = \sum_{n=1}^{\infty} a_n z^{n}.$$

The coefficients $a_n$ are integers, and $a_1 = 1$.
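
The first few coefficients can be computed directly from the product by
truncating the power series. The following is a minimal sketch; the
function name `delta_coefficients` is an illustrative choice and not
notation from the text.

```python
def delta_coefficients(num_terms):
    """Return {n: a_n} for n = 1..num_terms, where
    z * prod_{n>=1} (1 - z^n)^24 = sum_{n>=1} a_n z^n."""
    limit = num_terms              # the product is needed up to degree num_terms - 1
    coeffs = [0] * limit
    coeffs[0] = 1                  # start from the constant series 1
    for n in range(1, limit):
        for _ in range(24):
            # Multiply in place by (1 - z^n); going from high degree to
            # low degree keeps coeffs[k - n] at its old value.
            for k in range(limit - 1, n - 1, -1):
                coeffs[k] -= coeffs[k - n]
    # The extra factor z shifts every exponent up by one.
    return {n: coeffs[n - 1] for n in range(1, num_terms + 1)}

for n, a_n in delta_coefficients(7).items():
    print(n, a_n)
```

Factors $(1 - z^{n})$ with $n$ at least `num_terms` do not affect the
coefficients that are kept, which is why the loop over `n` can stop early.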

Theorem. For each prime $l$ there exists a unique Galois extension $K$ of
$\mathbf{Q}$, with Galois group $G$, and an injective homomorphism

$$\rho : G \to GL_2(\mathbf{F}_l)$$

having the following property: For all but a finite number of primes $p$, if
$a_p$ is the coefficient of $z^{p}$ in $\Delta(z)$, then we have

$$\mathrm{tr}\, \rho(\sigma_p) \equiv a_p \bmod l \quad \text{and} \quad \det \rho(\sigma_p) \equiv p^{11} \bmod l.$$

Furthermore, for all but a finite number of primes $l$ (which can be
explicitly determined), the image $\rho(G)$ in $GL_2(\mathbf{F}_l)$ consists of those
matrices $M \in GL_2(\mathbf{F}_l)$ such that $\det M$ is an eleventh power in $\mathbf{F}_l^*$.

The above theorem was conjectured by Serre in 1968, Séminaire de
Théorie des Nombres, Delange-Pisot-Poitou. A proof of the existence as
in the first statement was given by Deligne, Séminaire Bourbaki,
1968-1969. The second statement, describing how big the Galois group
actually is in the group of matrices $GL_2(\mathbf{F}_l)$, is due to Serre and
Swinnerton-Dyer, Bourbaki Seminar, 1972; see also Swinnerton-Dyer’s article
in Springer Lecture Notes 350 on “Modular Functions of One Variable
III”.

Of course, the product and series for $\Delta(z)$ have been pulled out of
nowhere. To explain the somewhere which makes such product and series
natural would take another book. 


Still another type of question about $G_{\mathbf{Q}}$

The above theorems involve the arithmetic of prime ideals and Frobenius 
automorphisms. There is still another possibility for describing Galois
groups. Let $F$ be a field, $F^{a}$ an algebraic closure, and let

$$G_F = \mathrm{Gal}(F^{a}/F)$$

be the group of automorphisms of its algebraic closure, leaving $F$ fixed.
The question is what does $G_{\mathbf{Q}}$ look like?

Artin showed that the only elements of finite order in $G_{\mathbf{Q}}$ are complex
conjugation (for any imbedding of $\mathbf{Q}^{a}$ in $\mathbf{C}$), and all conjugates of
complex conjugation in $G_{\mathbf{Q}}$ (Abhandlung Math. Seminar Hamburg, 1924).

The structure of $G_{\mathbf{Q}}$ is complicated. But I shall now develop some
notions leading to a conjecture of Shafarevich which gives some idea of 
one possible formulation for part of the answer. 

Let $G$ be any group. Let $\mathcal{F}$ be the family of all subgroups of finite
index. Let $\{x_n\}$ be a sequence in $G$. We say that $\{x_n\}$ is a Cauchy
sequence if given a subgroup $H \in \mathcal{F}$ there exists $n_0$ such that for
$m, n \ge n_0$ we have $x_n x_m^{-1} \in H$. A sequence is said to be null if given
$H \in \mathcal{F}$ there exists $n_0$ such that for all $n \ge n_0$ we have $x_n \in H$. As an
exercise (see Exercise 4 of Chapter VII, §7), show that the Cauchy
sequences form a group, and that the null sequences form a normal subgroup.
The factor group is called the completion of $G$ with respect to the
subgroups of finite index, and this completion is denoted by $\hat{G}$.

Let $X = \{x_1, x_2, \ldots\}$ be a denumerable sequence of symbols. It can be
shown that there is a group $G$, called the free group on $X$, having the
following property. Every element of $G$ can be written in the form

$$x_{i_1}^{m_1} \cdots x_{i_r}^{m_r},$$

where $m_1, \ldots, m_r$ are integers, $x_{i_1}, \ldots, x_{i_r}$ are elements of the
sequence $X$, and $i_j \ne i_{j+1}$ for $j = 1, \ldots, r - 1$. Furthermore, such an
element is equal to 1 if and only if $m_1 = \cdots = m_r = 0$. We also call $G$
the free group on a countable set of symbols.
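
The normal form above can be illustrated by a small reduction routine. This
is only a sketch under an assumed encoding: a word is a list of pairs
`(i, m)` standing for $x_i^{m}$, and the name `reduce_word` is a
hypothetical helper, not something defined in the text.

```python
def reduce_word(word):
    """Bring a word, given as a list of (index, exponent) pairs, to the
    form in which adjacent indices differ and no exponent is zero."""
    reduced = []
    for index, exponent in word:
        if exponent == 0:
            continue                      # x_i^0 contributes nothing
        if reduced and reduced[-1][0] == index:
            # Adjacent powers of the same symbol combine by adding exponents.
            total = reduced.pop()[1] + exponent
            if total != 0:
                reduced.append((index, total))
        else:
            reduced.append((index, exponent))
    return reduced

# x_1 x_2 x_2^{-1} x_1^{-1} collapses to the empty word, i.e. to 1.
print(reduce_word([(1, 1), (2, 1), (2, -1), (1, -1)]))
# x_1^2 x_1^3 x_2 becomes x_1^5 x_2.
print(reduce_word([(1, 2), (1, 3), (2, 1)]))
```

Using a stack means that a cancellation in the middle of a word
automatically exposes the next pair of neighbours for comparison.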

The following conjecture is due to Shafarevich, following work of Iwa- 
sawa (Annals of Mathematics, 1953) and Shafarevich ( Izvestia Akademia 
Nauk , 1954). 

Conjecture. Let $F_0$ be the union of all fields $\mathbf{Q}(\zeta)$, where $\zeta$ ranges
over all roots of unity. Let $F$ be a finite extension of $F_0$. Then $G_F$ is
isomorphic to the completion $\hat{G}$, where $G$ is the free group on a
countable set of symbols.

For an excellent historical account and other matters, see Matzat’s 
paper, Jahresbericht Deutsche Math. Vereinigung , 1986-1987. 


CHAPTER IX 


The Real and Complex 
Numbers 


IX, §1. ORDERING OF RINGS 

Let $R$ be an integral ring. By an ordering of $R$ one means a subset $P$ of
$R$ satisfying the following conditions:

ORD 1. For every $x \in R$ we have $x \in P$, or $x = 0$, or $-x \in P$, and these
three possibilities are mutually exclusive.

ORD 2. If $x, y \in P$ then $x + y \in P$ and $xy \in P$.

We also say that $R$ is ordered by $P$, and call $P$ the set of positive
elements.

Let us assume that $R$ is ordered by $P$. Since $1 \ne 0$, and $1 = 1^2 = (-1)^2$
we see that 1 is an element of $P$, i.e. 1 is positive. By ORD 2 and
induction, it follows that $1 + \cdots + 1$ (sum taken $n$ times) is positive.
An element $x \in R$ such that $x \ne 0$ and $x \notin P$ is called negative. If $x$, $y$
are negative elements of $R$, then $xy$ is positive (because $-x \in P$, $-y \in P$,
and hence $(-x)(-y) = xy \in P$). If $x$ is positive and $y$ is negative, then
$xy$ is negative, because $-y$ is positive, and hence $x(-y) = -xy$ is
positive. For any $x \in R$, $x \ne 0$, we see that $x^2$ is positive.

Suppose that $R$ is a field. If $x$ is positive and $x \ne 0$ then $x x^{-1} = 1$,
and hence by the preceding remarks, it follows that $x^{-1}$ is also positive.

Let $R$ be an arbitrary ordered integral ring again, and let $R'$ be a
subring. Let $P$ be the set of positive elements in $R$, and let
$P' = P \cap R'$. Then it is clear that $P'$ defines an ordering on $R'$, which
is called the induced ordering.

More generally, let $R'$ and $R$ be ordered rings, and let $P'$, $P$ be their
sets of positive elements respectively. Let $f : R' \to R$ be an embedding
(i.e. an injective homomorphism). We shall say that $f$ is order-preserving
if for every $x \in R'$ such that $x \in P'$ we have $f(x) \in P$. This is equivalent
to saying that $f^{-1}(P) = P'$ [where $f^{-1}(P)$ is the set of all $x \in R'$ such
that $f(x) \in P$].

Let $x, y \in R$. We define $x < y$ (or $y > x$) to mean that $y - x \in P$. Thus
to say that $x > 0$ is equivalent to saying that $x \in P$; and to say that
$x < 0$ is equivalent to saying that $x$ is negative, or $-x$ is positive. One
verifies easily the usual relations for inequalities, namely for $x, y, z \in R$:

IN 1. x < y and y < z implies x < z.

IN 2. x < y and z > 0 implies xz < yz. 

IN 3. x < y implies x + z < y + z. 

If R is a field, then 

IN 4. x < y and x, y > 0 implies 1/y < 1/x. 

As an example, we shall prove IN 2. We have $y - x \in P$ and $z \in P$, so
that by ORD 2, $(y - x)z \in P$. But $(y - x)z = yz - xz$, so that by
definition, $xz < yz$. As another example, to prove IN 4, we multiply the
inequality $x < y$ by $x^{-1}$ and $y^{-1}$ to find the assertion of IN 4. The
others are left as exercises.

If $x, y \in R$ we define $x \le y$ to mean that $x < y$ or $x = y$. Then one
verifies at once that IN 1, 2, 3 hold if we replace throughout the $<$ sign
by $\le$. Furthermore, one also verifies at once that if $x \le y$ and $y \le x$
then $x = y$.

In the next theorem, we see how an ordering on an integral ring can 
be extended to an ordering of its quotient field. 

Theorem 1.1. Let $R$ be an integral ring, ordered by $P$. Let $K$ be its
quotient field. Let $P_K$ be the set of elements of $K$ which can be written
in the form $a/b$ with $a, b \in R$, $b > 0$ and $a > 0$. Then $P_K$ defines an
ordering on $K$ extending $P$.

Proof. Let $x \in K$, $x \ne 0$. Multiplying a numerator and denominator of
$x$ by $-1$ if necessary, we can write $x$ in the form $x = a/b$ with $a, b \in R$
and $b > 0$. If $a > 0$ then $x \in P_K$. If $-a > 0$ then $-x = -a/b \in P_K$. We
cannot have both $x$ and $-x \in P_K$, for otherwise, we could write

$$x = a/b \quad \text{and} \quad -x = c/d$$

with $a, b, c, d \in R$ and $a, b, c, d > 0$. Then

$$-a/b = c/d,$$

whence $-ad = bc$. But $bc \in P$ and $ad \in P$, a contradiction. This proves
that $P_K$ satisfies ORD 1. Next, let $x, y \in P_K$ and write

$$x = a/b \quad \text{and} \quad y = c/d$$

with $a, b, c, d \in R$ and $a, b, c, d > 0$. Then $xy = ac/bd \in P_K$. Also

$$x + y = \frac{ad + bc}{bd}$$

lies in $P_K$. This proves that $P_K$ satisfies ORD 2. If $a \in R$, $a > 0$, then
$a = a/1 \in P_K$ so $P \subset P_K$. This proves our theorem.

Theorem 1.1 shows in particular how one extends the usual ordering
on the ring of integers $\mathbf{Z}$ to the field of rational numbers $\mathbf{Q}$. How one
defines the integers, and the ordering on them, will be discussed in an
appendix.

Let $R$ be an ordered ring as before. If $x \in R$, we define

$$|x| = \begin{cases} x & \text{if } x \ge 0, \\ -x & \text{if } x < 0. \end{cases}$$

We then have the following characterization of the function $x \mapsto |x|$,
which is called the absolute value:

For every $x \in R$, $|x|$ is the unique element $z \in R$ such that $z \ge 0$ and
$z^2 = x^2$.

To prove this, observe first that certainly $|x|^2 = x^2$, and $|x| \ge 0$ for all
$x \in R$. On the other hand, given $a \in R$, $a > 0$ there exist at most two
elements $z \in R$ such that $z^2 = a$ because the polynomial $t^2 - a$ has at most
two roots. If $w^2 = a$ then $w \ne 0$ and $(-w)^2 = w^2 = a$ also. Hence there
is at most one positive element $z \in R$ such that $z^2 = a$. This proves our
assertion.

We define the symbol $\sqrt{a}$ for $a \ge 0$ in $R$ to be the element $z \ge 0$ in $R$
such that $z^2 = a$, if such $z$ exists. Otherwise, $\sqrt{a}$ is not defined. It
is now easy to see that if $a, b \ge 0$ and $\sqrt{a}$, $\sqrt{b}$ exist, then $\sqrt{ab}$
exists and

$$\sqrt{ab} = \sqrt{a}\,\sqrt{b}.$$

Indeed, if $z, w \ge 0$ and $z^2 = a$, $w^2 = b$, then $(zw)^2 = z^2 w^2 = ab$. Thus we
may express the definition of the absolute value by means of the
expression $|x| = \sqrt{x^2}$.

The absolute value satisfies the following rules:

AV 1. For all $x \in R$, we have $|x| \ge 0$, and $|x| > 0$ if $x \ne 0$.

AV 2. $|xy| = |x||y|$ for all $x, y \in R$.

AV 3. $|x + y| \le |x| + |y|$ for all $x, y \in R$.

The first one is obvious. As to AV 2, we have

$$|xy| = \sqrt{(xy)^2} = \sqrt{x^2 y^2} = \sqrt{x^2}\,\sqrt{y^2} = |x||y|.$$

For AV 3, we have

$$\begin{aligned}
|x + y|^2 = (x + y)^2 &= x^2 + xy + xy + y^2 \\
&\le |x|^2 + 2|xy| + |y|^2 \\
&= |x|^2 + 2|x||y| + |y|^2 \\
&= (|x| + |y|)^2.
\end{aligned}$$

Taking square roots yields what we want. (We have used implicitly two
properties of inequalities, cf. Exercise 1.)

IX, §1. EXERCISES 

1. Let $R$ be an ordered integral ring.

(a) Prove that $x \le |x|$ for all $x \in R$.

(b) If $a, b \ge 0$ and $a \le b$, and if $\sqrt{a}$, $\sqrt{b}$ exist, show that
$\sqrt{a} \le \sqrt{b}$.

2. Let $K$ be an ordered field. Let $P$ be the set of polynomials

$$f(t) = a_n t^{n} + \cdots + a_0$$

over $K$, with $a_n > 0$. Show that $P$ defines an ordering on $K[t]$.

3. Let $R$ be an ordered integral ring. If $x, y \in R$, prove that $|-x| = |x|$,

$$|x - y| \ge |x| - |y|,$$

and also

$$|x + y| \ge |x| - |y|.$$

Also prove that $|x| \le |x + y| + |y|$.

4. Let $K$ be an ordered field and $f : \mathbf{Q} \to K$ the embedding of the rational
numbers into $K$. Show that $f$ is necessarily order preserving.


IX, §2. PRELIMINARIES

Let K be an ordered field. From Exercise 4 of the preceding section, 
we know that the embedding of Q in K is order preserving. We shall 
identify Q as a subfield of K. 

We recall formally a definition. Let $S$ be a set. A sequence of
elements of $S$ is simply a mapping

$$\mathbf{Z}^{+} \to S$$

from the positive integers into $S$. One usually denotes a sequence with
the notation

$$\{x_1, x_2, \ldots\}$$

or

$$\{x_n\}_{n \ge 1}$$

or simply

$$\{x_n\}$$

if there is no danger of confusing this with the set consisting of the single
element $x_n$.

A sequence $\{x_n\}$ in $K$ is said to be a Cauchy sequence if given an
element $\epsilon > 0$ in $K$, there exists a positive integer $N$ such that for all
integers $m, n \ge N$ we have

$$|x_n - x_m| \le \epsilon.$$

(For simplicity, we agree to let $N$, $n$, $m$ denote positive integers unless
otherwise specified. We also agree that $\epsilon$ denotes elements of $K$.)

To avoid the use of excessively many symbols, we shall say that a
certain statement $S$ concerning positive integers holds for all sufficiently
large integers if there exists $N$ such that the statement $S(n)$ holds for all
$n \ge N$. It is clear that if $S_1, \ldots, S_r$ is a finite number of statements,
each holding for all sufficiently large integers, then they are valid
simultaneously for all sufficiently large integers. Indeed, if

$$S_1(n) \text{ is valid for } n \ge N_1, \ \ldots, \ S_r(n) \text{ is valid for } n \ge N_r,$$

we let $N$ be the maximum of $N_1, \ldots, N_r$ and see that each $S_i(n)$ is valid
for $n \ge N$.

We shall say that a statement holds for arbitrarily large integers if
given $N$, the statement holds for some $n \ge N$.

A sequence $\{x_n\}$ in $K$ is said to converge if there exists an element
$x \in K$ such that, given $\epsilon > 0$ we have

$$|x - x_n| \le \epsilon$$

for all sufficiently large $n$.

An ordered field in which every Cauchy sequence converges is said
to be complete. We observe that the number $x$ above, if it exists, is
uniquely determined, for if $y \in K$ is such that

$$|y - x_n| \le \epsilon$$

for all sufficiently large $n$, then

$$|x - y| \le |x - x_n + x_n - y| \le |x - x_n| + |x_n - y| \le 2\epsilon.$$

This is true for every $\epsilon > 0$ in $K$, and it follows that $x - y = 0$, that is
$x = y$. We call this number $x$ the limit of the sequence $\{x_n\}$.

An ordered field $K$ will be said to be archimedean if given $x \in K$ there
exists a positive integer $n$ such that $x \le n$. It then follows that given
$\epsilon > 0$ in $K$, we can find an integer $m > 0$ such that $1/\epsilon < m$, whence
$1/m < \epsilon$.

It is easy to see that the field of rational numbers is not complete.
For instance, one can construct Cauchy sequences of rationals whose
square approaches 2, but such that the sequence has no limit in $\mathbf{Q}$
(otherwise, $\sqrt{2}$ would be rational). In the next section, we shall construct
an archimedean complete field, which will be called the real numbers.
Here, we prove one property of such fields, which is taken as the
starting point of analysis.
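
The text does not single out a particular sequence of rationals whose
squares approach 2. One possible choice, assumed here purely for
illustration, is the iteration $x_{k+1} = (x_k + 2/x_k)/2$ starting from
$x_0 = 1$; the function name `sqrt2_approximations` is likewise hypothetical.
The sketch uses exact rational arithmetic, so every term really is an
element of $\mathbf{Q}$.

```python
from fractions import Fraction

def sqrt2_approximations(count):
    """Return the first `count` terms of x_{k+1} = (x_k + 2/x_k)/2, x_0 = 1."""
    terms = []
    x = Fraction(1)
    for _ in range(count):
        terms.append(x)
        x = (x + 2 / x) / 2     # stays an exact rational number
    return terms

terms = sqrt2_approximations(5)
for x in terms:
    # The second column is x^2 - 2; it shrinks rapidly but is never 0.
    print(x, x * x - 2)
```

Each printed difference $x^2 - 2$ is a nonzero rational, in line with the
remark that no rational number has square equal to 2.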

Let $S$ be a subset of $K$. By an upper bound for $S$ one means an
element $z \in K$ such that $x \le z$ for all $x \in S$. By a least upper bound of $S$
one means an element $w \in K$ such that $w$ is an upper bound, and such
that, if $z$ is an upper bound, then $w \le z$. If $w_1$, $w_2$ are least upper
bounds of $S$, then $w_1 \le w_2$ and $w_2 \le w_1$ so that $w_1 = w_2$: A least
upper bound is uniquely determined.

Theorem 2.1. Let K be a complete archimedean ordered field. Then 
every non-empty subset S of K which has an upper bound also has a 
least upper bound. 

Proof. For each positive integer $n$ we consider the set $T_n$ consisting of
all integers $y$ such that for all $x \in S$, we have $nx \le y$ (and consequently,
$x \le y/n$). Then $T_n$ is bounded from below by any element $nx$ (with $x \in S$),
and is not empty because if $b$ is an upper bound for $S$, then any integer
$y$ such that $nb \le y$ will be in $T_n$. (We use the archimedean property.)
Let $y_n$ be the smallest element of $T_n$. Then there exists an element $x_n$ of
$S$ such that

$$y_n - 1 < n x_n \le y_n$$

(otherwise, $y_n$ is not the smallest element of $T_n$). Hence

$$\frac{y_n}{n} - \frac{1}{n} < x_n \le \frac{y_n}{n}.$$

Let $z_n = y_n/n$. We contend that the sequence $\{z_n\}$ is Cauchy. To prove
this, let $m$, $n$ be positive integers and say $y_n/n \le y_m/m$. We contend that

$$\frac{y_m}{m} - \frac{1}{m} < \frac{y_n}{n} \le \frac{y_m}{m}.$$

Otherwise

$$\frac{y_n}{n} \le \frac{y_m}{m} - \frac{1}{m}$$

and

$$\frac{y_m}{m} - \frac{1}{m}$$

is an upper bound for $S$, which is not true because $x_m$ is bigger. This
proves our contention, from which we see that

$$\left| \frac{y_n}{n} - \frac{y_m}{m} \right| \le \frac{1}{m}.$$

For $m$, $n$ sufficiently large, this is arbitrarily small, and we have proved
that our sequence $\{z_n\}$ is Cauchy.

Let $w$ be its limit. We first prove that $w$ is an upper bound for $S$.
Suppose there exists $x \in S$ such that $w < x$. There exists an $n$ such that

$$|z_n - w| \le \frac{x - w}{2}.$$

Then

$$x - z_n = x - w + w - z_n \ge x - w - |w - z_n| \ge x - w - \frac{x - w}{2} = \frac{x - w}{2} > 0,$$

so $x > z_n$, contradicting the fact that $z_n$ is an upper bound for $S$.

We now show that $w$ is a least upper bound for $S$. Let $u < w$. There
exists some $n$ such that

$$|z_n - x_n| \le \frac{1}{n} < \frac{w - u}{4}.$$

(Just select $n$ sufficiently large.) We can also select $n$ sufficiently large so
that

$$|z_n - w| \le \frac{w - u}{4}$$

since $w$ is the limit of $\{z_n\}$. Now

$$\begin{aligned}
x_n - u &= w - u + x_n - z_n + z_n - w \\
&\ge w - u - |x_n - z_n| - |z_n - w| \\
&\ge w - u - \frac{w - u}{4} - \frac{w - u}{4} = \frac{w - u}{2} > 0,
\end{aligned}$$

whence $u < x_n$. Hence $u$ is not an upper bound. This proves that $w$ is
the least upper bound, and concludes the proof of the theorem.
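
The construction in the proof can be followed numerically in one concrete
case. The sketch below assumes, for illustration only, that $S$ is the set
of rational numbers $x$ with $x^2 < 2$; the function names `y` and `z` mirror
the symbols $y_n$ and $z_n$ of the proof, and the closed formula used for
$y_n$ is specific to this choice of $S$.

```python
from fractions import Fraction
from math import isqrt

def y(n):
    """Smallest integer y with n*x <= y for every rational x with x*x < 2."""
    # For this S the admissible integers are exactly the positive y with
    # y*y > 2*n*n (2*n*n is never a perfect square), and isqrt returns the
    # largest integer whose square is at most 2*n*n.
    return isqrt(2 * n * n) + 1

def z(n):
    """The upper bound z_n = y_n / n from the proof, as an exact fraction."""
    return Fraction(y(n), n)

for n in (1, 10, 100, 1000):
    print(n, z(n))
```

In this example each $z_n$ is an upper bound for $S$ while $z_n - 1/n$ is
not, and two terms $z_m$, $z_n$ differ by at most the larger of $1/m$ and
$1/n$, which is the estimate used to show that the sequence is Cauchy.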


IX, §3. CONSTRUCTION OF THE REAL NUMBERS 

We start with the rational numbers $\mathbf{Q}$ and their ordering obtained from
the ordering of the integers $\mathbf{Z}$ as in Theorem 1.1 of §1. We wish to
define the real numbers. In elementary school, one uses the real numbers
as infinite decimals, like

$$\sqrt{2} = 1.414\ldots$$

Such an infinite decimal is nothing but a sequence of rational numbers,
namely

$$1, \ 1.4, \ 1.41, \ 1.414, \ \ldots$$

and it should be noted that there exist other sequences which “approach”
$\sqrt{2}$. If one wishes to define $\sqrt{2}$, it is then reasonable to take as
definition an equivalence class of sequences of rational numbers, under a
suitable concept of equivalence. We shall do this for all real numbers.

We start with our ordered field $\mathbf{Q}$ and Cauchy sequences in $\mathbf{Q}$. Let
$\gamma = \{c_n\}$ be a sequence of rational numbers. We say that $\gamma$ is a null
sequence if given a rational number $\epsilon > 0$ we have

$$|c_n| \le \epsilon$$

for all sufficiently large $n$. Unless otherwise specified we deal with
rational $\epsilon$ in what follows, and our sequences are sequences of rational
numbers.
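
To make the idea of different sequences approaching the same number
concrete, the following sketch compares two sequences of rationals: the
decimal truncations 1, 1.4, 1.41, ... and the same truncations with the last
digit increased by one. The function names `truncation` and `rounded_up`
are illustrative assumptions. The difference of the two sequences is
$1/10^{k}$, which is a null sequence in the sense just defined.

```python
from fractions import Fraction
from math import isqrt

def truncation(k):
    """Decimal expansion of the positive square root of 2, cut after k digits."""
    # isqrt(2 * 10^(2k)) is the integer part of sqrt(2) * 10^k.
    return Fraction(isqrt(2 * 10 ** (2 * k)), 10 ** k)

def rounded_up(k):
    """The k-digit truncation with its last digit increased by one."""
    return truncation(k) + Fraction(1, 10 ** k)

for k in range(4):
    print(truncation(k), rounded_up(k), rounded_up(k) - truncation(k))
```

Both sequences consist of rational numbers only; the squares of the first
stay below 2 and the squares of the second stay above 2.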

If $\alpha = \{a_n\}$ and $\beta = \{b_n\}$ are sequences of rational numbers, we define
$\alpha + \beta$ to be the sequence $\{a_n + b_n\}$, i.e. the sequence whose $n$-th term is
$a_n + b_n$. We define the product $\alpha\beta$ to be the sequence whose $n$-th term is
$a_n b_n$. Thus the set of sequences of rational numbers is nothing but the
ring of all mappings of $\mathbf{Z}^{+}$ into $\mathbf{Q}$. We shall see in a moment that the
Cauchy sequences form a subring.

Lemma 3.1. Let $\alpha = \{a_n\}$ be a Cauchy sequence. There exists a
positive rational number $B$ such that $|a_n| \le B$ for all $n$.

Proof. Given 1 there exists $N$ such that for all $n \ge N$ we have

$$|a_n - a_N| \le 1.$$

Then for all $n \ge N$,

$$|a_n| \le |a_N| + 1.$$

We let $B$ be the maximum of $|a_1|, |a_2|, \ldots, |a_{N-1}|, |a_N| + 1$.

Lemma 3.2. The Cauchy sequences form a commutative ring.

Proof. Let $\alpha = \{a_n\}$ and $\beta = \{b_n\}$ be Cauchy sequences. Given
$\epsilon > 0$, we have

$$|a_n - a_m| \le \frac{\epsilon}{2}$$

for all $m$, $n$ sufficiently large, and also

$$|b_n - b_m| \le \frac{\epsilon}{2}$$

for all $m$, $n$ sufficiently large. Hence for all $m$, $n$ sufficiently large, we
have

$$\begin{aligned}
|a_n + b_n - (a_m + b_m)| &= |a_n - a_m + b_n - b_m| \\
&\le |a_n - a_m| + |b_n - b_m| \\
&\le \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon.
\end{aligned}$$

Hence the sum $\alpha + \beta$ is a Cauchy sequence. One sees at once that

$$-\alpha = \{-a_n\}$$

is a Cauchy sequence. As for the product, we have

$$\begin{aligned}
|a_n b_n - a_m b_m| &= |a_n b_n - a_n b_m + a_n b_m - a_m b_m| \\
&\le |a_n||b_n - b_m| + |a_n - a_m||b_m|.
\end{aligned}$$

By Lemma 3.1, there exists $B_1 > 0$ such that $|a_n| \le B_1$ for all $n$, and
$B_2 > 0$ such that $|b_n| \le B_2$ for all $n$. Let $B = \max(B_1, B_2)$. For all $m$, $n$
sufficiently large, we have

$$|a_n - a_m| \le \frac{\epsilon}{2B} \quad \text{and} \quad |b_n - b_m| \le \frac{\epsilon}{2B},$$

and consequently

$$|a_n b_n - a_m b_m| \le B \cdot \frac{\epsilon}{2B} + B \cdot \frac{\epsilon}{2B} = \epsilon.$$

So the product $\alpha\beta$ is a Cauchy sequence. It is clear that the sequence
$e = \{1, 1, 1, \ldots\}$ is a Cauchy sequence. Hence Cauchy sequences form a
ring, and a subring of the ring of all mappings of $\mathbf{Z}^{+}$ into $\mathbf{Q}$. This ring
is obviously commutative.

Lemma 3.3. The null sequences form an ideal in the ring of Cauchy
sequences.

Proof. Let $\beta = \{b_n\}$ and $\gamma = \{c_n\}$ be null sequences. Given $\epsilon > 0$, for
all $n$ sufficiently large we have

$$|b_n| \le \frac{\epsilon}{2} \quad \text{and} \quad |c_n| \le \frac{\epsilon}{2}.$$

Hence for all $n$ sufficiently large, we have

$$|b_n + c_n| \le \epsilon,$$

so $\beta + \gamma$ is a null sequence. It is clear that $-\beta$ is a null sequence.

By Lemma 3.1, given a Cauchy sequence $\alpha = \{a_n\}$, there exists a
rational number $B > 0$ such that $|a_n| \le B$ for all $n$. For all $n$ sufficiently
large, we have

$$|b_n| \le \frac{\epsilon}{B},$$

whence

$$|a_n b_n| \le B \cdot \frac{\epsilon}{B} = \epsilon,$$

so that $\alpha\beta$ is a null sequence. This proves that null sequences form an
ideal, as desired.

Let $R$ be the ring of Cauchy sequences and $M$ the ideal of null
sequences. We then have the notion of congruence, that is if $\alpha, \beta \in R$,
we had defined $\alpha \equiv \beta \pmod{M}$ to mean $\alpha - \beta \in M$, or in other words
$\alpha = \beta + \gamma$ for some null sequence $\gamma$. We define a real number to be a
congruence class of Cauchy sequences. As we know from constructing
arbitrary factor rings, the set of such congruence classes is itself a ring,
denoted by $R/M$, but which we shall also denote by $\mathbf{R}$. The congruence
class of the sequence $\alpha$ will be denoted by $\bar{\alpha}$ for the moment. Then by
definition,

$$\overline{\alpha + \beta} = \bar{\alpha} + \bar{\beta}, \qquad \overline{\alpha\beta} = \bar{\alpha}\,\bar{\beta}.$$

The unit element of $\mathbf{R}$ is the class of the Cauchy sequence $\{1, 1, 1, \ldots\}$.

Theorem 3.4. The ring $R/M = \mathbf{R}$ of real numbers is in fact a field.

Proof. We must prove that if $\alpha$ is a Cauchy sequence, and is not a
null sequence, then there exists a Cauchy sequence $\beta$ such that $\alpha\beta \equiv e$
(mod $M$), where $e = \{1, 1, 1, \ldots\}$. We need a lemma on null sequences.

Lemma 3.5. Let $\alpha$ be a Cauchy sequence, and not a null sequence.
Then there exist $N_0$ and a rational number $c > 0$ such that $|a_n| \ge c$ for
all $n \ge N_0$.

Proof. Suppose otherwise. Let $\alpha = \{a_n\}$. Then given $\epsilon > 0$, there exists
an infinite sequence $n_1 < n_2 < \cdots$ of positive integers such that

$$|a_{n_i}| < \frac{\epsilon}{3}$$

for each $i = 1, 2, \ldots$. By definition, there exists $N$ such that for
$m, n \ge N$ we have

$$|a_n - a_m| \le \frac{\epsilon}{3}.$$

Let $n_i \ge N$. We have for $m \ge N$,

$$|a_m| \le |a_m - a_{n_i}| + |a_{n_i}| \le \frac{2\epsilon}{3},$$

and for $m, n \ge N$,

$$|a_n| \le |a_n - a_m| + |a_m| \le \frac{\epsilon}{3} + \frac{2\epsilon}{3} = \epsilon.$$

This shows that $\alpha$ is a null sequence, contrary to hypothesis, and proves
our lemma.

We return to the proof of the theorem. By Lemma 3.5, there exists
$N_0$ such that for $n \ge N_0$ we have $a_n \ne 0$. Let $\beta = \{b_n\}$ be the sequence
such that $b_n = 1$ if $n < N_0$, and $b_n = a_n^{-1}$ if $n \ge N_0$. Then $\beta\alpha$ differs from
$e$ only in a finite number of terms, and so $\beta\alpha - e$ is certainly a null
sequence. There remains to prove that $\beta$ is a Cauchy sequence.
By Lemma 3.5, we can select $N_0$ such that for all $n \ge N_0$ we have
$|a_n| \ge c > 0$. It follows that

$$\frac{1}{|a_n|} \le \frac{1}{c}.$$

Given $\epsilon > 0$, there exists $N$ (which we can take $\ge N_0$) such that for all
$m, n \ge N$ we have

$$|a_n - a_m| \le \epsilon c^2.$$

Then for $m, n \ge N$ we get

$$\left| \frac{1}{a_n} - \frac{1}{a_m} \right| = \left| \frac{a_m - a_n}{a_m a_n} \right| \le \frac{\epsilon c^2}{c^2} = \epsilon,$$

thereby proving that $\beta$ is a Cauchy sequence, and concluding the proof
of our theorem.

We have constructed the field of real numbers. 

Observe that we have a natural ring-homomorphism of $\mathbf{Q}$ into $\mathbf{R}$,
obtained by mapping each rational number $a$ on the class of the Cauchy
sequence $\{a, a, a, \ldots\}$. This is a composite of two homomorphisms, first
the map

$$a \mapsto \{a, a, a, \ldots\}$$

of $\mathbf{Q}$ into the ring of Cauchy sequences, followed by the map

$$R \to R/M.$$

Since this is not the zero homomorphism, it follows that it is an
isomorphism of $\mathbf{Q}$ onto its image.

The next lemma is designed for the purpose of defining an ordering on 
the real numbers. 

Lemma 3.6. Let $\alpha = \{a_n\}$ be a Cauchy sequence. Exactly one of the
following possibilities holds:

(1) $\alpha$ is a null sequence.

(2) There exists a rational $c > 0$ such that for all $n$ sufficiently large,
$a_n \ge c$.

(3) There exists a rational $c < 0$ such that for all $n$ sufficiently large,
$a_n \le c$.

Proof. It is clear that if $\alpha$ satisfies one of the three possibilities, then it
cannot satisfy any other, i.e. the possibilities are mutually exclusive.
What we must show is that at least one of the possibilities holds.
Suppose that $\alpha$ is not a null sequence. By Lemma 3.5 there exists $N_0$ and a
rational number $c > 0$ such that $|a_n| \ge c$ for all $n \ge N_0$. Thus $a_n \ge c$ if
$a_n$ is positive, and $-a_n \ge c$ if $a_n$ is negative. Suppose that there exist
arbitrarily large integers $n$ such that $a_n$ is positive, and arbitrarily large
integers $m$ such that $a_m$ is negative. Then for such $m$, $n$ we have

$$a_n - a_m \ge 2c > 0,$$

thereby contradicting the fact that $\alpha$ is a Cauchy sequence. This proves
that (2) or (3) must hold, and concludes the proof of the lemma.

Lemma 3.7. Let $\alpha = \{a_n\}$ be a Cauchy sequence and let $\beta = \{b_n\}$ be a
null sequence. If $\alpha$ satisfies property (2) of Lemma 3.6 then $\alpha + \beta$ also
satisfies this property, and if $\alpha$ satisfies property (3) of Lemma 3.6, then
$\alpha + \beta$ also satisfies property (3).

Proof. Suppose that $\alpha$ satisfies property (2). For all $n$ sufficiently
large, by definition of a null sequence, we have $|b_n| \le c/2$. Hence for
sufficiently large $n$,

$$a_n + b_n \ge |a_n| - |b_n| \ge c/2.$$

A similar argument proves the analogue for property (3). This proves
the lemma.

We may now define an ordering on the real numbers. We let $P$ be
the set of real numbers which can be represented by a Cauchy sequence
$\alpha$ having property (2), and prove that $P$ defines an ordering.

Let $\alpha$ be a Cauchy sequence representing a real number. If $\alpha$ is not
null and does not satisfy (2), then $-\alpha$ obviously satisfies (2). By Lemma
3.7, every Cauchy sequence representing the same real number as $\alpha$ also
satisfies (2). Hence $P$ satisfies condition ORD 1.

Let $\alpha = \{a_n\}$ and $\beta = \{b_n\}$ be Cauchy sequences representing real
numbers in $P$, and satisfying (2). There exists $c_1 > 0$ such that $a_n \ge c_1$
for all sufficiently large $n$, and there exists $c_2 > 0$ such that $b_n \ge c_2$ for all
sufficiently large $n$. Hence $a_n + b_n \ge c_1 + c_2 > 0$ for sufficiently large $n$,
thereby proving that $\alpha + \beta$ is also in $P$. Furthermore,

$$a_n b_n \ge c_1 c_2 > 0$$

for all sufficiently large $n$, so that $\alpha\beta$ is in $P$. This proves that $P$ defines
an ordering on the real numbers.

We recall that we had obtained an isomorphism of $\mathbf{Q}$ onto a subfield
of $\mathbf{R}$, given by the map

$$a \mapsto \{a, a, a, \ldots\}.$$

In view of Exercise 4, §1 this map is order preserving, but this is also
easily seen directly from our definitions. For a while, we shall not
identify $a$ with its image in $\mathbf{R}$, and we denote by $\bar{a}$ the class of the
Cauchy sequence $\{a, a, a, \ldots\}$.

Theorem 3.8. The ordering of $\mathbf{R}$ is archimedean.

Proof. Let $\alpha$ be a real number, represented by a Cauchy sequence
$\{a_n\}$. By Lemma 3.1, we can find a rational number $r$ such that
$a_n \le r$ for all $n$, and multiplying $r$ by a positive denominator, we see that
there exists an integer $b$ such that $a_n \le b$ for all $n$. Then $\bar{b} - \alpha$ is
represented by the sequence $\{b - a_n\}$ and $b - a_n \ge 0$ for all $n$. By
definition, it follows that

$$\bar{b} - \alpha \ge 0,$$

whence $\alpha \le \bar{b}$, as desired.

The following lemma gives us a criterion for inequalities between real 
numbers in terms of Cauchy sequences. 

Lemma 3.9. Let $\gamma = \{c_n\}$ be a Cauchy sequence of rational numbers,
and let $c$ be a rational number $> 0$. If $|c_n| \le c$ for all $n$ sufficiently
large, then $|\gamma| \le \bar c$.

Proof. If $\gamma = 0$, our assertion is trivial. Suppose $\gamma \ne 0$, and say $\gamma > 0$.
Then $|\gamma| = \gamma$, and thus we must show that $\bar c - \gamma \ge 0$. But for all $n$
sufficiently large, we have

$$c - c_n \ge 0.$$

Since $\bar c - \gamma = \{c - c_n\}$, it follows from our definition of the ordering in R
that $\bar c - \gamma \ge 0$. The case when $\gamma < 0$ is proved by considering $-\gamma$.

Given a real number $\epsilon > 0$, by Theorem 3.8 there exists a rational
number $\epsilon_1 > 0$ such that $0 < \bar\epsilon_1 < \epsilon$. Hence in the definition of limit,
when we are given the $\epsilon$, it does not matter whether we take it real or
rational.

Lemma 3.10. Let $\alpha = \{a_n\}$ be a Cauchy sequence of rational numbers.
Then $\alpha$ is the limit of the sequence $\{\bar a_n\}$.

Proof. Given a rational number $\epsilon > 0$, there exists $N$ such that, for
$m, n \ge N$ we have

$$|a_n - a_m| \le \epsilon.$$

Then for all $m \ge N$ we have by Lemma 3.9, for all $n \ge N$,

$$|\alpha - \bar a_m| = |\{a_n - a_m\}| \le \bar\epsilon.$$

This proves our assertion. 
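
To make the notion concrete, here is a small sketch in Python (not part of the original text) of a Cauchy sequence of rational numbers. The choice of Newton's iteration for the square root of 2, and the helper names `newton_sqrt2_terms` and `tail_spread`, are assumptions made purely for illustration. The sequence has no rational limit, yet the spread of its tails shrinks rapidly, which is exactly the Cauchy condition; the real number it represents is, by Lemma 3.10, the limit of its terms.

```python
from fractions import Fraction


def newton_sqrt2_terms(count):
    """Return the first `count` terms of a_1 = 1, a_{n+1} = (a_n + 2/a_n)/2."""
    terms = [Fraction(1)]
    while len(terms) < count:
        a = terms[-1]
        terms.append((a + 2 / a) / 2)  # exact rational arithmetic
    return terms


def tail_spread(terms, start):
    """Largest |a_n - a_m| over the computed terms with index >= start (0-based)."""
    tail = terms[start:]
    return max(tail) - min(tail)


terms = newton_sqrt2_terms(7)
spreads = [tail_spread(terms, k) for k in range(6)]
for k, s in enumerate(spreads):
    print(k, float(s))
```

Only finitely many terms are inspected, so this illustrates the Cauchy condition rather than proving it.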

Theorem 3.11. The field of real numbers is complete. 

Proof. Let $\{A_n\}$ be a Cauchy sequence of real numbers. For each $n$,
by Lemma 3.10, we can find a rational number $a_n$ such that

$$|A_n - \bar a_n| \le \frac{1}{n}.$$

(Strictly speaking, we still should write $1/\bar n$ on the right-hand side!)
Furthermore, by definition, given $\epsilon > 0$ there exists $N$ such that for all
$m, n \ge N$ we have

$$|A_n - A_m| \le \frac{\epsilon}{3}.$$

Let $N_1$ be an integer $\ge N$, and such that $1/N_1 < \epsilon/3$. Then for all
$m, n \ge N_1$ we get

$$|\bar a_n - \bar a_m| = |\bar a_n - A_n + A_n - A_m + A_m - \bar a_m|$$

$$\le |\bar a_n - A_n| + |A_n - A_m| + |A_m - \bar a_m|$$

$$\le \frac{\epsilon}{3} + \frac{\epsilon}{3} + \frac{\epsilon}{3} = \epsilon.$$

This proves that $\{a_n\}$ is a Cauchy sequence of rational numbers. Let $A$
be its limit. For all $n$, we have

$$|A_n - A| \le |A_n - \bar a_n| + |\bar a_n - A|.$$

If we take $n$ sufficiently large, we see that $A$ is also the limit of the
sequence $\{A_n\}$, thereby proving our theorem.

The procedure we have followed to construct a complete field from the
rational numbers can be generalized in many contexts, and occurs often
in mathematics. The exercises will show you some number-theoretic
contexts, as in the construction of $p$-adic fields for prime numbers $p$. But
in analysis, the construction is also applied to vector spaces, not
necessarily fields. For instance, let $V$ be the real vector space of all
continuous functions on R, vanishing outside some bounded interval. On
$V$ we can define a norm as follows. Let $f \in V$. We define

$$\|f\|_1 = \int_{-\infty}^{\infty} |f(x)| \, dx.$$

This norm satisfies properties analogous to AV 1, AV 2 and AV 3,
namely:

N 1. Let $f \in V$. Then $\|f\|_1 \ge 0$, and $= 0$ if and only if $f = 0$.

N 2. If $c \in \mathbf{R}$ and $f \in V$, then $\|cf\|_1 = |c| \, \|f\|_1$.

N 3. If $f, g \in V$, then $\|f + g\|_1 \le \|f\|_1 + \|g\|_1$.

One can then define Cauchy sequences, null sequences, and one can
construct the factor space of Cauchy sequences modulo null sequences, to
obtain a vector space $\overline{V}$. The norm can be extended to $\overline{V}$, and one can
then prove the theorem that $\overline{V}$ is complete. In analysis, one analyzes this
completion, which is in some sense the largest vector space of functions
whose absolute value is integrable over R.

In the above context of vector spaces, there is no question of ordering 
the elements of V , so the whole part of the construction of the real 
numbers having to do with ordering simply drops out of consideration. 
Similarly, in the exercises having to do with completions, there will be 
no ordering properties. The completions will be constructed only with 
the ring of Cauchy sequences and the maximal ideal of null sequences. 


IX, §3. EXERCISES 

1. Prove that every positive real number has a positive square root in R. Since
the polynomial $t^2 - a$ has at most two roots in a field, and since for any root
$\alpha$, the number $-\alpha$ is also a root, it follows that for every $a \in \mathbf{R}$, $a \ge 0$, there
exists a unique $\alpha \in \mathbf{R}$, $\alpha \ge 0$ such that $\alpha^2 = a$. [Hint: For the above proof, let
$\alpha$ be the least upper bound of the set of rational numbers $b$ such that $b^2 \le a$.]

2. Show that every automorphism of the real numbers is the identity. [Hint:
Show first that an automorphism is order preserving.]

3. Let $p$ be a prime number. If $x$ is a non-zero rational number, written in the
form $x = p^r a/b$ where $r$ is an integer, $a, b$ are integers not divisible by $p$, we
define

$$|x|_p = 1/p^r.$$

Define $|0|_p = 0$. Show that for all rational $x, y$ we have

$$|xy|_p = |x|_p |y|_p \quad\text{and}\quad |x + y|_p \le |x|_p + |y|_p.$$

In fact, prove the stronger property

$$|x + y|_p \le \max(|x|_p, |y|_p).$$

4. Let $F$ be a field. By an absolute value on $F$ we mean a real-valued function
$x \mapsto |x|$ satisfying the following properties:

AV 1. We have $|x| \ge 0$, and $= 0$ if and only if $x = 0$.

AV 2. For all $x, y \in F$ we have $|xy| = |x||y|$.

AV 3. $|x + y| \le |x| + |y|$.

(a) Define the notion of Cauchy sequence and null sequence with respect to an
absolute value $v$ on a field $F$. Define what it means for a field to be
complete with respect to $v$.

(b) For an absolute value as above, prove that the Cauchy sequences form a
ring, the null sequences form a maximal ideal, and the residue class ring is
a field. Show that the absolute value can be extended to this field, and
that this field is complete.

(c) Let $F \subset E$ be a subfield, and suppose $E$ has an absolute value extending an
absolute value on $F$. We define $F$ to be dense in $E$ if given $\epsilon > 0$, and an
element $\alpha \in E$, there exists $a \in F$ such that $|\alpha - a| < \epsilon$. Prove:

There exists a field $E$ which contains $F$ as a subfield, such that $E$ has an
absolute value extending the absolute value on $F$, such that $F$ is dense in
$E$, and $E$ is complete.

Such a field $E$ is called a completion of $F$.

(d) Prove the uniqueness of a completion, in the following sense. Let $E, E'$ be
completions of $F$. Then there exists an isomorphism

$$\sigma : E \to E'$$

which restricts to the identity on $F$ and such that $\sigma$ preserves the absolute
value, that is for all $\alpha \in E$ we have

$$|\sigma\alpha| = |\alpha|.$$

The completion of a field $F$ with respect to an absolute value $v$ is
usually denoted by $F_v$.

5. An absolute value is called non-archimedean if instead of AV 3 it satisfies the
stronger property

$$|x + y| \le \max(|x|, |y|).$$

The function $|\ |_p$ on Q is called the $p$-adic absolute value, and is
non-archimedean. Suppose $|\ |$ is a non-archimedean absolute value on a field $F$.
Prove that given $x \in F$, $x \ne 0$, there exists a positive number $r$ such that if
$|y - x| < r$ then $|y| = |x|$.

6. Let $|\ |$ be a non-archimedean absolute value on a field $F$. Let $R$ be the subset
of elements $x \in F$ such that $|x| \le 1$.

(a) Show that $R$ is a ring, and that for every $x \in F$ we have $x \in R$ or $x^{-1} \in R$.

(b) Let $M$ be the subset of elements $x \in R$ such that $|x| < 1$. Show that $M$ is a
maximal ideal.
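
For experimenting with the $p$-adic absolute value of Exercise 3, the following Python sketch may help. It is not a solution to the exercise: it computes $|x|_p$ for rational $x$ and spot-checks the inequality $|x + y|_p \le \max(|x|_p, |y|_p)$ on a small sample. The names `p_adic_abs`, `ultrametric_holds` and `samples` are assumptions introduced here.

```python
from fractions import Fraction


def p_adic_abs(x, p):
    """Return |x|_p, writing x = p**r * a/b with a, b not divisible by p."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    r = 0
    num, den = x.numerator, x.denominator
    while num % p == 0:  # powers of p in the numerator raise r
        num //= p
        r += 1
    while den % p == 0:  # powers of p in the denominator lower r
        den //= p
        r -= 1
    return Fraction(1, p) ** r


def ultrametric_holds(p, values):
    """Check |x + y|_p <= max(|x|_p, |y|_p) for all pairs in `values`."""
    return all(
        p_adic_abs(x + y, p) <= max(p_adic_abs(x, p), p_adic_abs(y, p))
        for x in values
        for y in values
    )


samples = [Fraction(n, d) for n in range(-6, 7) for d in range(1, 7)]
print(p_adic_abs(Fraction(18), 3), ultrametric_holds(3, samples))
```

A finite check of this kind is of course no substitute for the proof asked for in the exercise.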


IX, §4. DECIMAL EXPANSIONS 

Theorem 4.1. Let $d$ be an integer $\ge 2$, and let $m$ be an integer $\ge 0$.
Then $m$ can be written in a unique way in the form

$$m = a_0 + a_1 d + \cdots + a_n d^n \tag{1}$$

with integers $a_i$ such that $0 \le a_i < d$.

Proof. This is easily seen from the Euclidean algorithm, and we shall
give the proof. For the existence, if $m < d$, we let $a_0 = m$ and $a_i = 0$ for
$i > 0$. If $m \ge d$ we write

$$m = qd + a_0$$

with $0 \le a_0 < d$, using the Euclidean algorithm. Then $q < m$, and by
induction, there exist integers $a_i$ ($0 \le a_i < d$ and $i \ge 1$) such that

$$q = a_1 + a_2 d + \cdots + a_k d^{k-1}.$$

Substituting this value for $q$ yields what we want. As for uniqueness,
suppose that

$$m = b_0 + b_1 d + \cdots + b_n d^n \tag{2}$$

with integers $b_i$ satisfying $0 \le b_i < d$. (We can use the same $n$ simply by
adding terms with coefficients $b_i = 0$ or $a_i = 0$ if necessary.) Say $a_0 \le b_0$.
Then $b_0 - a_0 \ge 0$, and $b_0 - a_0 < d$. On the other hand, $b_0 - a_0 = de$ for
some integer $e$ [as one sees subtracting (2) from (1)]. Hence $b_0 - a_0 = 0$
and $b_0 = a_0$. Assume by induction that we have shown $a_i = b_i$ for
$0 \le i \le s$ and some $s < n$. Then

$$a_{s+1} d^{s+1} + \cdots + a_n d^n = b_{s+1} d^{s+1} + \cdots + b_n d^n.$$

Dividing both sides by $d^{s+1}$, we obtain

$$a_{s+1} + \cdots + a_n d^{n-s-1} = b_{s+1} + \cdots + b_n d^{n-s-1}.$$

By what we have just seen, it follows that $a_{s+1} = b_{s+1}$, and thus we have
proved uniqueness by induction, as desired.
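
The existence part of the proof is effectively an algorithm: divide by $d$, record the remainder as the next coefficient, and repeat on the quotient. A minimal Python sketch of this follows; the function name `base_d_digits` is an assumption made for the illustration.

```python
def base_d_digits(m, d):
    """Return [a_0, a_1, ..., a_n] with m = a_0 + a_1*d + ... + a_n*d**n
    and 0 <= a_i < d, obtained by repeated division with remainder."""
    if d < 2 or m < 0:
        raise ValueError("need d >= 2 and m >= 0")
    digits = []
    while True:
        m, a = divmod(m, d)  # m = q*d + a with 0 <= a < d; continue with q
        digits.append(a)
        if m == 0:
            return digits


digits = base_d_digits(2024, 7)
print(digits, sum(a * 7**i for i, a in enumerate(digits)))
```

The coefficients are returned lowest power first, matching the order $a_0, a_1, \ldots$ in formula (1).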

Let $x$ be a positive real number, and $d$ an integer $\ge 2$. Then $x$ has a
unique expression of the form

$$x = m + \alpha,$$

where $m$ is an integer and $0 \le \alpha < 1$. Indeed, we let $m$ be the largest integer $\le x$. Then
$x < m + 1$, and hence $0 \le x - m < 1$. We shall now describe a $d$-decimal
expansion for real numbers between 0 and 1.

Theorem 4.2. Let $x$ be a real number, $0 \le x < 1$. Let $d$ be an integer
$\ge 2$. For each positive integer $n$ there is a unique expression

$$x = \frac{a_1}{d} + \frac{a_2}{d^2} + \cdots + \frac{a_n}{d^n} + \alpha_n \tag{3}$$

with integers $a_i$ satisfying $0 \le a_i < d$ and $0 \le \alpha_n < 1/d^n$.

Proof. Let $m$ be the largest integer $\le d^n x$. Then $m \ge 0$ and

$$d^n x = m + \alpha_n'$$

with some number $\alpha_n'$ such that $0 \le \alpha_n' < 1$. Since $m < d^n$, Theorem 4.1
expresses $m$ using only powers of $d$ below $d^n$; we then divide by $d^n$ to obtain
the desired expression, with $\alpha_n = \alpha_n'/d^n$. Conversely, given
such an expression (3), we multiply it by $d^n$ and apply the uniqueness
part of Theorem 4.1 to obtain the uniqueness of (3). This proves our
theorem.
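
The proof can be followed step by step in exact rational arithmetic. The Python sketch below (an illustration only; the name `d_expansion` and the restriction to rational $x$ are assumptions) takes $m$ to be the largest integer $\le d^n x$, reads off the digits of $m$ as in Theorem 4.1, and returns them together with the remainder $\alpha_n$.

```python
from fractions import Fraction
from math import floor


def d_expansion(x, d, n):
    """Return ([a_1, ..., a_n], alpha_n) with
    x = a_1/d + ... + a_n/d**n + alpha_n and 0 <= alpha_n < 1/d**n."""
    x = Fraction(x)
    if not 0 <= x < 1 or d < 2 or n < 1:
        raise ValueError("need 0 <= x < 1, d >= 2, n >= 1")
    m = floor(d**n * x)            # largest integer <= d^n x
    alpha = x - Fraction(m, d**n)  # remainder after the first n digits
    digits = []
    for _ in range(n):             # base-d digits of m, lowest power first
        m, a = divmod(m, d)
        digits.append(a)
    digits.reverse()               # a_1 belongs to the highest power of d in m
    return digits, alpha


print(d_expansion(Fraction(1, 7), 10, 6))
```

For $x = 1/7$ and $d = 10$ this reproduces the familiar digits 1, 4, 2, 8, 5, 7.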

When $d = 10$, the numbers $a_1, a_2, \ldots$ in Theorem 4.2 are precisely
those of the decimal expansion of $x$, which is written

$$x = 0.a_1 a_2 a_3 \ldots$$

since time immemorial.

Conversely:

Theorem 4.3. Let $d$ be an integer $\ge 2$. Let $a_1, a_2, \ldots$ be a sequence of
integers, $0 \le a_i < d$ for all $i$, and assume that given a positive integer $N$
there exists some $n \ge N$ such that $a_n \ne d - 1$. Then there exists a real
number $x$ such that for each $n \ge 1$ we have

$$x = \frac{a_1}{d} + \frac{a_2}{d^2} + \cdots + \frac{a_n}{d^n} + \alpha_n,$$

where $\alpha_n$ is a number with $0 \le \alpha_n < 1/d^n$.

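Proof. We shall use freely some simple properties of limits and infinite
sums, treated in any standard beginning course of analysis. Let

$$y_n = \frac{a_1}{d} + \frac{a_2}{d^2} + \cdots + \frac{a_n}{d^n}.$$

Then the sequence $y_1, y_2, \ldots$ is increasing, and easily shown to be
bounded from above. Let $x$ be its least upper bound. Then $x$ is a limit
of the sequence, and

$$x = y_n + \alpha_n,$$

where

$$\alpha_n = \sum_{\nu = n+1}^{\infty} \frac{a_\nu}{d^\nu}.$$

Let

$$\beta_n = \sum_{\nu = n+1}^{\infty} \frac{d - 1}{d^\nu}.$$

By hypothesis, we have $\alpha_n < \beta_n$ because there is some $a_\nu$ with $\nu \ge n + 1$
such that $a_\nu \ne d - 1$. On the other hand,

$$\beta_n = \frac{d - 1}{d^{n+1}} \sum_{\nu = 0}^{\infty} \frac{1}{d^\nu} = \frac{d - 1}{d^{n+1}} \cdot \frac{1}{1 - \frac{1}{d}} = \frac{1}{d^n}.$$

Hence $0 \le \alpha_n < 1/d^n$, as was to be proved.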

Corollary 4.4. The real numbers are not denumerable.

Proof. Consider the subset of real numbers consisting of all decimal
sequences

$$\alpha = 0.a_1 a_2 a_3 \ldots$$

with $1 \le a_i \le 8$, taking $d = 10$ in Theorems 4.2 and 4.3. It will suffice to
prove that this subset is not denumerable. Suppose it is, and let

$$\alpha_1 = 0.a_{11} a_{12} a_{13} \ldots,$$

$$\alpha_2 = 0.a_{21} a_{22} a_{23} \ldots,$$

$$\alpha_3 = 0.a_{31} a_{32} a_{33} \ldots,$$

$$\ldots$$

be an enumeration of this subset. For each positive integer $n$, let $b_n$ be
an integer with $1 \le b_n \le 8$ such that $b_n \ne a_{nn}$. Let

$$\beta = 0.b_1 b_2 b_3 \ldots b_n \ldots.$$

Then $\beta$ is not equal to $\alpha_n$ for any $n$, since the two differ in the $n$-th digit
and the expansion in Theorem 4.2 is unique. This proves that there cannot be an
enumeration of the real numbers. (Note: The simple facts concerning the
terminology of denumerable sets used in this proof will be dealt with
systematically in the next chapter.)
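
The diagonal construction in this proof can be tried out on a finite list of digit strings. The Python sketch below is only an illustration on finitely many rows (the proof itself concerns an infinite enumeration); the function name `diagonal_escape` and the rule "use 1 unless the diagonal digit is 1, then use 2" are assumptions, one of many valid choices of $b_n$.

```python
def diagonal_escape(rows):
    """Given rows[n] = digits of alpha_{n+1} (each digit in 1..8, each row
    long enough), return digits b with b[n] != rows[n][n] and 1 <= b[n] <= 8."""
    result = []
    for n, row in enumerate(rows):
        a_nn = row[n]
        result.append(1 if a_nn != 1 else 2)  # any digit in 1..8 other than a_nn
    return result


rows = [
    [1, 4, 2, 8],
    [3, 3, 3, 3],
    [5, 7, 1, 2],
    [8, 8, 8, 8],
]
beta = diagonal_escape(rows)
print(beta)
```

By construction the resulting digit string differs from the $n$-th row in position $n$, so it cannot coincide with any listed row.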


IX, §5. THE COMPLEX NUMBERS 

Our purpose in this section is to identify the real numbers with a
subfield of some field in which the equation $t^2 = -1$ has a root. As is usual
in these matters, we define the bigger field in a way designed to make
this equation obvious, and must then prove all desired properties.

We define a complex number to be a pair $(x, y)$ of real numbers. We
define addition componentwise. If $z = (x, y)$ we define multiplication of $z$
by a real number $a$ to be

$$az = (ax, ay).$$

Thus the set of complex numbers, denoted by C, is nothing so far but
$\mathbf{R}^2$, and can be viewed already as a vector space over R. We let
$e = (1, 0)$ and $i = (0, 1)$. Then every complex number can be expressed
in a unique way as a sum $xe + yi$ with $x, y \in \mathbf{R}$. We must still define the
multiplication of complex numbers. If $z = xe + yi$ and $w = ue + vi$ are
complex numbers with $x, y, u, v \in \mathbf{R}$ we define

$$zw = (xu - yv)e + (xv + yu)i.$$

Observe at once that $ez = ze = z$ for all $z \in \mathbf{C}$, and $i^2 = -e$. We now
contend that C is a field. We already know that it is an additive
(abelian) group. If $z_1 = x_1 e + y_1 i$, $z_2 = x_2 e + y_2 i$, and $z_3 = x_3 e + y_3 i$,
then

$$(z_1 z_2) z_3 = \bigl((x_1 x_2 - y_1 y_2)e + (y_1 x_2 + x_1 y_2)i\bigr)(x_3 e + y_3 i)$$

$$= (x_1 x_2 x_3 - y_1 y_2 x_3 - y_1 x_2 y_3 - x_1 y_2 y_3)e + (y_1 x_2 x_3 + x_1 y_2 x_3 + x_1 x_2 y_3 - y_1 y_2 y_3)i.$$

A similar computation of $z_1 (z_2 z_3)$ shows that one gets the same value as
for $(z_1 z_2) z_3$. Furthermore, letting $w = ue + vi$ again, we have

$$w(z_1 + z_2) = (ue + vi)\bigl((x_1 + x_2)e + (y_1 + y_2)i\bigr)$$

$$= \bigl(u(x_1 + x_2) - v(y_1 + y_2)\bigr)e + \bigl(v(x_1 + x_2) + u(y_1 + y_2)\bigr)i$$

$$= (u x_1 - v y_1 + u x_2 - v y_2)e + (v x_1 + u y_1 + v x_2 + u y_2)i.$$

Computing $w z_1 + w z_2$ directly shows that one gets the same thing as
$w(z_1 + z_2)$. We also have obviously $wz = zw$ for all $w, z \in \mathbf{C}$, and hence
$(z_1 + z_2)w = z_1 w + z_2 w$. This proves that the complex numbers form a
commutative ring.

The map $x \mapsto (x, 0)$ is immediately verified to be an injective
homomorphism of R into C, and from now on, we identify R with its image in
C, that is we write $x$ instead of $xe$ for $x \in \mathbf{R}$.

If $z = x + iy$ is a complex number, we define its complex conjugate

$$\bar z = x - iy.$$

Then from our multiplication rule, we see that

$$z \bar z = x^2 + y^2.$$

If $z \ne 0$, then at least one of the real numbers $x$ or $y$ is not equal to 0,
and one sees that

$$\lambda = \frac{\bar z}{x^2 + y^2}$$

is such that $z\lambda = \lambda z = 1$, because

$$z \, \frac{\bar z}{x^2 + y^2} = \frac{z \bar z}{x^2 + y^2} = 1.$$

Hence every non-zero element of C has an inverse, and consequently C
is a field, which contains R as a subfield [taking into account our
identification of $x$ with $(x, 0)$].

We define the absolute value of a complex number $z = x + iy$ to be

$$|z| = \sqrt{x^2 + y^2}$$

and in terms of the absolute value, we can write the inverse of a
non-zero complex number $z$ in the form

$$z^{-1} = \frac{\bar z}{|z|^2}.$$

If $z, w$ are complex numbers, it is easily shown that

$$|z + w| \le |z| + |w| \quad\text{and}\quad |zw| = |z||w|.$$

Furthermore, $\overline{z + w} = \bar z + \bar w$ and $\overline{zw} = \bar z \, \bar w$. We leave these properties
as exercises. We have thus brought the theory of complex numbers to
the point where the analysts take over.
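
The construction above translates directly into a few lines of code. In the Python sketch below (an illustration, not part of the original text) a complex number is a pair of rationals, so that all arithmetic is exact; the function names `mul`, `conj` and `inverse` are assumptions chosen for the example.

```python
from fractions import Fraction


def mul(z, w):
    """(x, y)(u, v) = (xu - yv, xv + yu), the multiplication rule of the text."""
    x, y = z
    u, v = w
    return (x * u - y * v, x * v + y * u)


def conj(z):
    """Complex conjugate of z = (x, y)."""
    return (z[0], -z[1])


def inverse(z):
    """Inverse of a non-zero z, computed as conj(z) / (x^2 + y^2)."""
    x, y = z
    norm = x * x + y * y
    if norm == 0:
        raise ZeroDivisionError("0 has no inverse")
    zc = conj(z)
    return (zc[0] / norm, zc[1] / norm)


i = (Fraction(0), Fraction(1))
z = (Fraction(3), Fraction(4))
print(mul(i, i), mul(z, inverse(z)))
```

Using `Fraction` components rather than floating-point numbers is a design choice for the sketch: it lets identities such as $z z^{-1} = 1$ be checked exactly.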

In particular, using the exponential function, one proves that every
positive real number $r$ has a real $n$-th root, and that in fact any complex
number $w$ has an $n$-th root, for any positive integer $n$. This is done by
using the polar form,

$$w = re^{i\theta}$$

with real $\theta$, in which case $r^{1/n} e^{i\theta/n}$ is such an $n$-th root.

Aside from this fact, we shall use that a continuous real-valued func- 
tion on a closed, bounded set of complex numbers has a maximum. All 
of this is proved in elementary courses in analysis. 

Using these facts, we shall now prove that: 


Theorem 5.1. The complex numbers are algebraically closed, in other
words, every polynomial $f \in \mathbf{C}[t]$ of degree $\ge 1$ has a root in C.

Proof. We may write

$$f(t) = a_n t^n + a_{n-1} t^{n-1} + \cdots + a_0$$

with $a_n \ne 0$. For every real $R > 0$, the function $|f|$ such that

$$t \mapsto |f(t)|$$

is continuous on the closed disc of radius $R$, and hence has a minimum
value on this disc. On the other hand, from the expression

$$f(t) = a_n t^n \left(1 + \frac{a_{n-1}}{a_n t} + \cdots + \frac{a_0}{a_n t^n}\right)$$

we see that when $|t|$ becomes large, then $|f(t)|$ also becomes large, i.e.
given $C > 0$ there exists $R > 0$ such that if $|t| > R$ then $|f(t)| > C$.
Consequently, there exists a positive number $R_0$ such that, if $z_0$ is a
minimum point of $|f|$ on the closed disc of radius $R_0$, then

$$|f(t)| \ge |f(z_0)|$$

for all complex numbers $t$. In other words, $z_0$ is an absolute minimum
for $|f|$. We shall prove that $f(z_0) = 0$.

We express $f$ in the form

$$f(t) = c_0 + c_1 (t - z_0) + \cdots + c_n (t - z_0)^n$$

with constants $c_i$. If $f(z_0) \ne 0$, then $c_0 = f(z_0) \ne 0$. Let $z = t - z_0$, and
let $m$ be the smallest integer $> 0$ such that $c_m \ne 0$. This integer $m$ exists
because $f$ is assumed to have degree $\ge 1$. Then we can write

$$f(t) = f_1(z) = c_0 + c_m z^m + z^{m+1} g(z)$$

for some polynomial $g$, and some polynomial $f_1$ (obtained from $f$ by
changing the variable). Let $z_1$ be a complex number such that

$$z_1^m = -c_0/c_m,$$

and consider values of $z$ of type

$$z = \lambda z_1,$$

where $\lambda$ is real, $0 \le \lambda \le 1$. We have

$$f(t) = f_1(\lambda z_1) = c_0 - \lambda^m c_0 + \lambda^{m+1} z_1^{m+1} g(\lambda z_1)$$

$$= c_0 \bigl[1 - \lambda^m + \lambda^{m+1} z_1^{m+1} c_0^{-1} g(\lambda z_1)\bigr].$$

There exists a number $C > 0$ such that for all $\lambda$ with $0 \le \lambda \le 1$ we have
$|z_1^{m+1} c_0^{-1} g(\lambda z_1)| \le C$, and hence

$$|f_1(\lambda z_1)| \le |c_0| (1 - \lambda^m + C\lambda^{m+1}).$$

If we can now prove that for sufficiently small $\lambda$ with $0 < \lambda < 1$ we have

$$0 < 1 - \lambda^m + C\lambda^{m+1} < 1,$$

then for such $\lambda$ we get $|f_1(\lambda z_1)| < |c_0|$, thereby contradicting the
hypothesis that $|f(z_0)| \le |f(t)|$ for all complex numbers $t$. The left inequality is
of course obvious since $0 < \lambda < 1$. The right inequality amounts to
$C\lambda^{m+1} < \lambda^m$, or equivalently $C\lambda < 1$, which is certainly satisfied for
sufficiently small $\lambda$. This concludes the proof.

Remark. The idea of the proof is quite simple. We have our
polynomial

$$f_1(z) = c_0 + c_m z^m + z^{m+1} g(z),$$

and $c_m \ne 0$. If $g = 0$, we simply adjust $c_m z^m$ so as to subtract a term in
the same direction as $c_0$, to shrink it towards the origin. This is done by
extracting the suitable $m$-th root as above. Since $g \ne 0$ in general, we
have to do a slight amount of analytic juggling to show that the third
term is very small compared to $c_m z^m$, and that it does not disturb the
general idea of the proof in an essential way.
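
The descent step in the proof can be observed numerically. The Python sketch below uses floating-point complex numbers and a hypothetical polynomial $f_1(z) = 2 + (1 + i)z^2 + 3z^3$ chosen for illustration; the helper names `evaluate` and `descent_direction` are likewise assumptions. It picks $z_1$ with $z_1^m = -c_0/c_m$ and evaluates $|f_1(\lambda z_1)|$ for a few small $\lambda$, each of which comes out below $|c_0|$.

```python
def evaluate(coeffs, z):
    """Evaluate c_0 + c_1 z + ... + c_n z^n by Horner's rule."""
    total = 0
    for c in reversed(coeffs):
        total = total * z + c
    return total


def descent_direction(coeffs):
    """Return (m, z_1): m is the smallest index > 0 with c_m != 0, and
    z_1 is an m-th root of -c_0/c_m (computed in floating point)."""
    c0 = coeffs[0]
    m = next(k for k in range(1, len(coeffs)) if coeffs[k] != 0)
    z1 = complex(-c0 / coeffs[m]) ** (1.0 / m)
    return m, z1


# Hypothetical example: c_0 = 2, c_1 = 0, c_2 = 1 + i, c_3 = 3, so m = 2.
coeffs = [2, 0, 1 + 1j, 3]
m, z1 = descent_direction(coeffs)
values = [abs(evaluate(coeffs, lam * z1)) for lam in (0.1, 0.05, 0.01)]
print(m, z1, values)
```

This is a numerical illustration of one example, subject to rounding error, and not a substitute for the estimate with the constant $C$ in the proof.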


IX, §5. EXERCISES 

1. Assuming the result just proved about the complex numbers, prove that every 
irreducible polynomial over the real numbers has degree 1 or 2. [Hint: Split 
the polynomial over the complex numbers and pair off complex conjugate 
roots.] 

2. Prove that an irreducible polynomial of degree 2 over R, with leading
coefficient 1, can be written in the form

$$(t - a)^2 + b^2$$

with $a, b \in \mathbf{R}$, $b > 0$.


CHAPTER X 


Sets 


X, §1. MORE TERMINOLOGY 

This chapter is the most abstract of the book, and is the one dealing 
with objects having the least structure, namely just sets. The remarkable 
thing is that interesting facts can be proved with so little at hand. 

We shall first define some terminology. Let $S$ and $I$ be sets. By a
family of elements of $S$, indexed by $I$, one means simply a map $f: I \to S$.
However, when we speak of a family, we write $f(i)$ as $f_i$, and also use
the notation $\{f_i\}_{i \in I}$ to denote the family.

Example 1. Let $S$ be the set consisting of the single element 3. Let
$I = \{1, \ldots, n\}$ be the set of integers from 1 to $n$. A family of elements of
$S$, indexed by $I$, can then be written $\{a_i\}_{i = 1, \ldots, n}$ with each $a_i = 3$. Note
that a family is different from a subset. The same element of $S$ may
receive distinct indices.

A family of elements of a set S indexed by positive integers, or non- 
negative integers, is also called a sequence. 

Example 2. A sequence of real numbers is written frequently in the
form

$$\{x_1, x_2, \ldots\} \quad\text{or}\quad \{x_n\}_{n \ge 1}$$

and stands for the map $f: \mathbf{Z}^+ \to \mathbf{R}$ such that $f(i) = x_i$. As before, note
that a sequence can have all its elements equal to each other, that is

$$\{1, 1, 1, \ldots\}$$

is a sequence of integers, with $x_i = 1$ for each $i \in \mathbf{Z}^+$.

We define a family of sets indexed by a set $I$ in the same manner, that
is, a family of sets indexed by $I$ is an assignment

$$i \mapsto S_i$$

which to each $i \in I$ associates a set $S_i$. The sets $S_i$ may or may not have
elements in common, and it is conceivable that they may all be equal.
As before, we write the family $\{S_i\}_{i \in I}$.

We can define the intersection and union of families of sets, just as for
the intersection and union of a finite number of sets. Thus, if $\{S_i\}_{i \in I}$ is a
family of sets, we define the intersection of this family to be the set

$$\bigcap_{i \in I} S_i$$

consisting of all elements $x$ which lie in all $S_i$. We define the union

$$\bigcup_{i \in I} S_i$$

to be the set consisting of all $x$ such that $x$ lies in some $S_i$.

If $S, S'$ are sets, we define $S \times S'$ to be the set of all pairs $(x, y)$ with
$x \in S$ and $y \in S'$. We can define finite products in a similar way. If
$S_1, S_2, \ldots$ is a sequence of sets, we define the product

$$\prod_{i=1}^{\infty} S_i$$

to be the set of all sequences $(x_1, x_2, \ldots)$ with $x_i \in S_i$. Similarly, if $I$ is an
indexing set, and $\{S_i\}_{i \in I}$ a family of sets, we define the product

$$\prod_{i \in I} S_i$$

to be the set of all families $\{x_i\}_{i \in I}$ with $x_i \in S_i$.

Let $X, Y, Z$ be sets. We have the formula

$$(X \cup Y) \times Z = (X \times Z) \cup (Y \times Z).$$

To prove this, let $(w, z) \in (X \cup Y) \times Z$ with $w \in X \cup Y$ and $z \in Z$. Then
$w \in X$ or $w \in Y$. Say $w \in X$. Then $(w, z) \in X \times Z$. Thus

$$(X \cup Y) \times Z \subset (X \times Z) \cup (Y \times Z).$$

Conversely, $X \times Z$ is contained in $(X \cup Y) \times Z$ and so is $Y \times Z$. Hence
their union is contained in $(X \cup Y) \times Z$, thereby proving our assertion.

We say that two sets $X, Y$ are disjoint if their intersection is empty.
We say that a union $X \cup Y$ is disjoint if $X$ and $Y$ are disjoint. Note that
if $X, Y$ are disjoint, then $(X \times Z)$ and $(Y \times Z)$ are disjoint.
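
For finite sets, the formula for products and unions and the remark on disjointness can be checked mechanically. The Python sketch below is only a sanity check on small examples; the helper name `cartesian` and the particular sets are assumptions chosen for illustration.

```python
from itertools import product


def cartesian(a, b):
    """Return the product a x b as a set of ordered pairs."""
    return set(product(a, b))


X, Y, Z = {1, 2}, {2, 3}, {"p", "q"}
union_then_product = cartesian(X | Y, Z)
product_then_union = cartesian(X, Z) | cartesian(Y, Z)

# Disjoint X2, Y2 give disjoint products with Z.
X2, Y2 = {1, 2}, {3, 4}
disjoint_products = cartesian(X2, Z).isdisjoint(cartesian(Y2, Z))

print(union_then_product == product_then_union, disjoint_products)
```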

We can take products with unions of arbitrary families. For instance,
if $\{X_i\}_{i \in I}$ is a family of sets, then

$$\Bigl(\bigcup_{i \in I} X_i\Bigr) \times Z = \bigcup_{i \in I} (X_i \times Z).$$

If the family $\{X_i\}_{i \in I}$ is disjoint (that is $X_i \cap X_j$ is empty if $i \ne j$ for
$i, j \in I$), then the sets $X_i \times Z$ are also disjoint.

We have similar formulas for intersections. For instance,

$$(X \cap Y) \times Z = (X \times Z) \cap (Y \times Z).$$

We leave the proof to the reader.

Let $X$ be a set and $Y$ a subset. The complement of $Y$ in $X$, denoted by
$\mathcal{C}_X Y$, or $X - Y$, is the set of all elements $x \in X$ such that $x \notin Y$. If $Y, Z$
are subsets of $X$, then we have the following formulas:

$$\mathcal{C}_X (Y \cup Z) = \mathcal{C}_X Y \cap \mathcal{C}_X Z,$$

$$\mathcal{C}_X (Y \cap Z) = \mathcal{C}_X Y \cup \mathcal{C}_X Z.$$

These are essentially reformulations of definitions. For instance, suppose
$x \in X$ and $x \notin (Y \cup Z)$. Then $x \notin Y$ and $x \notin Z$. Hence $x \in \mathcal{C}_X Y \cap \mathcal{C}_X Z$.
Conversely, if $x \in \mathcal{C}_X Y \cap \mathcal{C}_X Z$, then $x$ lies neither in $Y$ nor $Z$, and hence
$x \in \mathcal{C}_X (Y \cup Z)$. This proves the first formula. We leave the second to the
reader. Exercise: Formulate these formulas for the complement of the
union of a family of sets, and the complement of the intersection of a
family of sets.

Let $A, B$ be sets and $f: A \to B$ a mapping. If $Y$ is a subset of $B$, we
define $f^{-1}(Y)$ to be the set of all $x \in A$ such that $f(x) \in Y$. It may be that
$f^{-1}(Y)$ is empty, of course. We call $f^{-1}(Y)$ the inverse image of $Y$
(under $f$). If $f$ is injective, and $Y$ consists of one element $y$, then
$f^{-1}(\{y\})$ either is empty, or has precisely one element. We shall give
certain simple properties of the inverse image as exercises.


X, §1. EXERCISES

1. If $f: A \to B$ is a map, and $Y, Z$ are subsets of $B$, prove the following formulas:

$$f^{-1}(Y \cup Z) = f^{-1}(Y) \cup f^{-1}(Z),$$

$$f^{-1}(Y \cap Z) = f^{-1}(Y) \cap f^{-1}(Z).$$

2. Formulate and prove the analogous properties of Exercise 1 for families of
subsets, e.g. if $\{Y_i\}_{i \in I}$ is a family of subsets of $B$, show that

$$f^{-1}\Bigl(\bigcup_{i \in I} Y_i\Bigr) = \bigcup_{i \in I} f^{-1}(Y_i).$$

3. Let $f: A \to B$ be a surjective map. Show that there exists an injective map of
$B$ into $A$.


X, §2. ZORN’S LEMMA

In order to deal efficiently with infinitely many sets simultaneously, one
needs a special axiom. To state it, we need some more terminology.

Let $S$ be a set. A partial ordering (also called an ordering) of $S$ is a
relation, written $x \le y$, among some pairs of elements of $S$, having the
following properties.

PO 1. We have $x \le x$.

PO 2. If $x \le y$ and $y \le z$ then $x \le z$.

PO 3. If $x \le y$ and $y \le x$ then $x = y$.

Note that we don’t require that the relation $x \le y$ or $y \le x$ hold for
every pair of elements $(x, y)$ of $S$. Some pairs may not be comparable.
We sometimes write $y \ge x$ for $x \le y$.

Example 1. Let $G$ be a group. Let $S$ be the set of subgroups. If
$H, H'$ are subgroups of $G$, we define

$$H \le H'$$

if $H$ is a subgroup of $H'$. One verifies immediately that this relation
defines a partial ordering on $S$. Given two subgroups $H, H'$ of $G$, we do
not necessarily have $H \le H'$ or $H' \le H$.

Example 2. Let $R$ be a ring, and let $S$ be the set of left ideals of $R$.
We define a partial ordering in $S$ in a way similar to the above, namely
if $L, L'$ are left ideals of $R$, we define

$$L \le L'$$

if $L \subset L'$.

Example 3. Let $X$ be a set, and $S$ the set of subsets of $X$. If $Y, Z$
are subsets of $X$, we define $Y \le Z$ if $Y$ is a subset of $Z$. This defines a
partial ordering on $S$.

In all these examples, the relation of partial ordering is said to be that
of inclusion.

In a partially ordered set, if $x \le y$ and $x \ne y$ we then write $x < y$.

Remark. We have not defined the word “relation”. This can be done
in terms of sets as follows. We define a relation between pairs of
elements of a set $A$ to be a subset $R$ of the product $A \times A$. If $x, y \in A$ and
$(x, y) \in R$, then we say that $x, y$ satisfy our relation. Using this
formulation, we can restate our conditions for a partial ordering relation in the
following form. For all $x, y, z \in A$:

PO 1. We have $(x, x) \in R$.

PO 2. If $(x, y) \in R$ and $(y, z) \in R$ then $(x, z) \in R$.

PO 3. If $(x, y) \in R$ and $(y, x) \in R$ then $x = y$.

The notation we used previously is, however, much easier to use, and
having shown how this notation can be explained only in terms of sets,
we shall continue to use it as before.

Let $A$ be a partially ordered set, and $B$ a subset. Then we can define
a partial ordering on $B$ by defining $x \le y$ for $x, y \in B$ to hold if and only
if $x \le y$ in $A$. In other words, if $R \subset A \times A$ is the subset of $A \times A$
defining our relation of partial ordering in $A$, we let $R_0 = R \cap (B \times B)$, and
then $R_0$ defines a relation of partial ordering in $B$. We shall say that $R_0$
is the partial ordering on $B$ induced by $R$, or is the restriction to $B$ of the
partial ordering of $A$.

Let $S$ be a partially ordered set. By a least element of $S$ (or a
smallest element) one means an element $a \in S$ such that $a \le x$ for all
$x \in S$. Similarly, by a greatest element one means an element $b$ such
that $x \le b$ for all $x \in S$.

By a maximal element $m$ of $S$ one means an element such that if $x \in S$
and $x \ge m$, then $x = m$. Note that a maximal element need not be a
greatest element. There may be many maximal elements in $S$, whereas if
a greatest element exists, then it is unique (proof?).
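
The difference between maximal and greatest elements is easy to see on a small finite example. The Python sketch below uses divisibility on a set of positive integers as the partial ordering; this choice of example and the names `maximal_elements`, `greatest_element` and `divides` are assumptions made for illustration.

```python
def maximal_elements(elements, leq):
    """Elements m such that leq(m, x) forces x == m."""
    return [m for m in elements if all(x == m or not leq(m, x) for x in elements)]


def greatest_element(elements, leq):
    """The element b with leq(x, b) for all x, or None if there is none."""
    for b in elements:
        if all(leq(x, b) for x in elements):
            return b
    return None


def divides(a, b):
    """Partial ordering on positive integers: a <= b means a divides b."""
    return b % a == 0


S = [2, 3, 4, 6, 8, 9, 12]
print(maximal_elements(S, divides), greatest_element(S, divides))

S2 = [1, 2, 3, 6]
print(maximal_elements(S2, divides), greatest_element(S2, divides))
```

In the first set there are several maximal elements and no greatest one; in the second, 6 is both the unique maximal element and the greatest element.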

Let $S$ be a partially ordered set. We shall say that $S$ is totally ordered
if given $x, y \in S$ we have necessarily $x \le y$ or $y \le x$.

Example 4. The integers Z are totally ordered by the usual ordering.
So are the real numbers.

Let $S$ be a partially ordered set, and $T$ a subset. An upper bound of $T$
(in $S$) is an element $b \in S$ such that $x \le b$ for all $x \in T$. A least upper
bound of $T$ in $S$ is an upper bound $b$ such that, if $c$ is another upper
bound, then $b \le c$. We shall say that $S$ is inductively ordered if every
non-empty totally ordered subset has an upper bound.

We shall say that $S$ is strictly inductively ordered if every non-empty
totally ordered subset has a least upper bound.

In Examples 1, 2, 3 in each case, the set is strictly inductively ordered.
To prove this, let us take Example 1. Let $T$ be a non-empty totally
ordered subset of the set of subgroups of $G$. This means that if $H, H' \in T$,
then $H \subset H'$ or $H' \subset H$. Let $U$ be the union of all sets in $T$. Then:

(1) $U$ is a subgroup. Proof: If $x, y \in U$, there exist subgroups $H, H' \in T$
such that $x \in H$ and $y \in H'$. If, say, $H \subset H'$, then both $x, y \in H'$
and hence $xy \in H'$. Hence $xy \in U$. Also, $x^{-1} \in H$, so $x^{-1} \in U$.
Hence $U$ is a subgroup.

(2) $U$ is an upper bound for each element of $T$. Proof: Every $H \in T$
is contained in $U$, so $H \le U$ for all $H \in T$.

(3) $U$ is a least upper bound for $T$. Proof: Any subgroup of $G$ which
contains all the subgroups $H \in T$ must then contain their union
$U$.

The proof that the sets in Examples 2, 3 are strictly inductively
ordered is entirely similar.

We can now state the axiom mentioned at the beginning of the
section.

Zorn’s lemma. Let $S$ be a non-empty inductively ordered set. Then
there exists a maximal element in $S$.

We shall see by two examples how one applies Zorn’s lemma.

Theorem 2.1. Let $R$ be a commutative ring with unit element $1 \ne 0$.
Then there exists a maximal ideal in $R$.

(Recall that a maximal ideal is an ideal $M$ such that $M \ne R$, and if $J$
is an ideal such that $M \subset J \subset R$, then $J = M$ or $J = R$.)

Proof. Let $S$ be the set of proper ideals of $R$, that is ideals $J$ such
that $J \ne R$. Then $S$ is not empty, because the zero ideal is in $S$.
Furthermore, $S$ is inductively ordered by inclusion. To see this, let $T$ be a
non-empty totally ordered subset of $S$. Let $U$ be the union of all ideals
in $T$. Then $U$ is an ideal (the proof being similar to the proof we gave
before concerning Example 1). The crucial thing here, however, is that $U$
is not equal to $R$. Indeed, if $U = R$, then $1 \in U$, and hence there is some
ideal $J \in T$ such that $1 \in J$ because $U$ is the union of such ideals $J$. This
is impossible since $S$ is a set of proper ideals. Hence $U$ is in $S$, and is
obviously an upper bound for $T$ (even a least upper bound), so $S$ is
inductively ordered, and the theorem follows by Zorn’s lemma.

Let $V$ be a non-zero vector space over a field $K$. Let $\{v_i\}_{i \in I}$ be a
family of elements of $V$. If $\{a_i\}_{i \in I}$ is a family of elements of $K$, such that
$a_i = 0$ for all but a finite number of indices $i$, then we can form the sum

$$\sum_{i \in I} a_i v_i.$$

If $i_1, \ldots, i_n$ are those indices for which $a_i \ne 0$, then the above sum is
defined to be

$$a_{i_1} v_{i_1} + \cdots + a_{i_n} v_{i_n}.$$

We shall say that the family $\{v_i\}_{i \in I}$ is linearly independent if, whenever we
have a family $\{a_i\}_{i \in I}$ with $a_i \in K$, all but a finite number of which are 0,
and

$$\sum_{i \in I} a_i v_i = 0,$$

then all $a_i = 0$. For simplicity, we shall abbreviate “all but a finite
number” by “almost all”. We say that a family $\{v_i\}_{i \in I}$ of elements of $V$
generates $V$ if every element $v \in V$ can be written in the form

$$v = \sum_{i \in I} a_i v_i$$

for some family $\{a_i\}_{i \in I}$ of elements of $K$, almost all $a_i$ being 0. A family
$\{v_i\}_{i \in I}$ which is linearly independent and generates $V$ is called a basis of
$V$.

If $U$ is a subset of $V$, we may view $U$ as a family, indexed by its own
elements. Thus if for each $v \in U$ we are given an element $a_v \in K$, almost
all $a_v = 0$, we can form the sum

$$\sum_{v \in U} a_v v.$$

In this way, we can define what it means for a subset of $V$ to generate $V$
and to be linearly independent. We can define a basis of $V$ to be a
subset of $V$ which generates $V$ and is linearly independent.

Theorem 2.2. Let V be a non-zero vector space over the field K. Then 
there exists a basis of V. 

Proof. Let S be the set of linearly independent subsets of V. Then S
is not empty, because for any $v \in V$, $v \neq 0$, the set $\{v\}$ is
linearly independent. If B, B' are elements of S, we define $B \leq B'$ if
$B \subset B'$. Then S is partially ordered, and is inductively ordered,
because if T is a totally ordered subset of S, then

$$U = \bigcup_{B \in T} B$$

is an upper bound for T in S. It is easily checked that U is linearly
independent. Let M be a maximal element of S, by Zorn’s lemma. Let
$v \in V$. Since M is maximal, if $v \notin M$, the set $M \cup \{v\}$ is not
linearly independent. Hence there exist elements $a_w \in K$ ($w \in M$) and
$b \in K$ not all 0, but almost all 0, such that

$$bv + \sum_{w \in M} a_w w = 0.$$

If b = 0, then we contradict the fact that M is linearly independent.
Hence $b \neq 0$, and

$$v = \sum_{w \in M} -b^{-1} a_w w$$

is a linear combination of elements of M. If $v \in M$, then trivially, v is
a linear combination of elements of M. Hence M generates V, and is
therefore the desired basis of V.
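
In a finite setting no appeal to Zorn’s lemma is needed, and the idea of the
proof (a maximal linearly independent subset generates) can be tried out
directly. The following is only an illustrative sketch under assumptions that
are not in the text: the field is taken to be the field with two elements, a
vector is stored as a Python integer read as a bitmask, and the function name
is chosen here for illustration.

```python
def maximal_independent_subset(vectors):
    """Greedily keep each vector not in the span of the vectors kept so far.

    Vectors are non-negative ints read as bitmasks, i.e. elements of GF(2)^n.
    The result is linearly independent and maximal inside `vectors`.
    """
    pivots = {}   # highest set bit -> a reduced vector having that leading bit
    kept = []
    for original in vectors:
        vec = original
        while vec:
            top = vec.bit_length() - 1
            if top not in pivots:
                # vec could not be reduced to 0, so `original` is independent
                # of the vectors kept so far.
                pivots[top] = vec
                kept.append(original)
                break
            vec ^= pivots[top]   # subtract (= add, over GF(2)) the pivot
    return kept


family = [0b011, 0b101, 0b110, 0b111, 0b001]
basis = maximal_independent_subset(family)
```

Here 0b110 is discarded because it is the sum of 0b011 and 0b101, and adjoining
any further vector of the family to `basis` destroys independence, exactly as
for the maximal element M in the proof.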

Remark. Zorn’s lemma is not psychologically completely satisfactory 
as an axiom, because its statement is too involved, and one does not 
visualize easily the existence of the maximal element asserted in that 
statement. It can be shown that Zorn’s lemma is implied by the follow- 
ing statement, known as the axiom of choice: 

Let $\{S_i\}_{i \in I}$ be a family of sets, and assume that each $S_i$ is not
empty. Then there exists a family of elements $\{x_i\}_{i \in I}$ with each
$x_i \in S_i$.

For a proof of the implication, see for instance Appendix 2 of my 
Algebra. 


X, §2. EXERCISES 

1. Write out in detail the proof that the sets of Examples 2, 3 are inductively 
ordered. 

2. In the proof of Theorem 2.2, write out in detail the proof of the statement: “It 
is easily checked that U is linearly independent.” 

3. Let R be a ring and E a finitely generated module over R, i.e. a module with
a finite number of generators $v_1, \ldots, v_n$. Assume that E is not the zero
module. Show that E has a maximal submodule, i.e. a submodule $M \neq E$ such
that if N is a submodule, $M \subset N \subset E$, then M = N or N = E.

4. Let R be a commutative ring and S a subset of R, S not empty, and $0 \notin S$.
Show that there exists an ideal M whose intersection with S is empty, and is
maximal with respect to this property. We then say that M is a maximal
ideal not meeting S.

5. Let A, B be two non-empty sets. Show that there exists an injective map of A
into B, or there exists a bijective map of a subset of A onto B. [Hint: Use
Zorn’s lemma on the family of injective maps of subsets of A into B.]


X, §3. CARDINAL NUMBERS 

Let A, B be sets. We shall say that the cardinality of A is the same as
the cardinality of B, and write

$$\mathrm{card}(A) = \mathrm{card}(B),$$

if there exists a bijection of A onto B.

We say $\mathrm{card}(A) \leq \mathrm{card}(B)$ if there exists an injective
mapping (injection) $f: A \to B$. We also write
$\mathrm{card}(B) \geq \mathrm{card}(A)$ in this case. It is clear that if
$\mathrm{card}(A) \leq \mathrm{card}(B)$ and
$\mathrm{card}(B) \leq \mathrm{card}(C)$, then

$$\mathrm{card}(A) \leq \mathrm{card}(C).$$

This amounts to saying that a composite of injective mappings is injective.
Similarly, if $\mathrm{card}(A) = \mathrm{card}(B)$ and
$\mathrm{card}(B) = \mathrm{card}(C)$ then

$$\mathrm{card}(A) = \mathrm{card}(C).$$

This amounts to saying that a composite of bijective mappings is bijective.
We clearly have $\mathrm{card}(A) = \mathrm{card}(A)$.

Finally, Exercise 5 of §2 shows:

Let A, B be non-empty sets. Then we have

$$\mathrm{card}(A) \leq \mathrm{card}(B) \quad \text{or} \quad \mathrm{card}(B) \leq \mathrm{card}(A).$$

We shall first discuss denumerable sets. A set D is called denumerable 
if there exists a bijection of D with the positive integers, and such a bi- 
jection is called an enumeration of the set D. 

Theorem 3.1. Any infinite subset of a denumerable set is denumerable. 

One proves this easily by induction. (We sketch the proof: It suffices
to prove that any infinite subset of the positive integers is denumerable.
Let $D = D_1$ be such a subset. Then $D_1$ has a least element $a_1$. Suppose
inductively that we have defined $D_n$ for an integer $n \geq 1$. Let
$D_{n+1}$ be the set of all elements of $D_n$ which are greater than the least
element of $D_n$. Let $a_n$ be the least element of $D_n$. Then we get an
injective mapping

$$n \mapsto a_n$$

of $\mathbf{Z}^+$ into D, and one sees at once that this map is surjective.)
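
The inductive construction in this sketch can be carried out mechanically
when the subset is given by a membership test. The following is an
illustration only; the name `enumerate_subset` and the description of D by a
predicate `belongs` are assumptions made here, and the search for each $a_n$
terminates only because D is assumed infinite.

```python
from itertools import count, islice


def enumerate_subset(belongs):
    """Yield a_1 < a_2 < ... for the set D of positive integers k with belongs(k).

    a_n is the least element of D_n, and D_{n+1} consists of the elements of D
    greater than a_n, so each step searches upwards from the previous value.
    """
    lower = 0
    while True:
        a = next(k for k in count(lower + 1) if belongs(k))
        yield a
        lower = a


# D = positive integers congruent to 1 modulo 3
first_terms = list(islice(enumerate_subset(lambda k: k % 3 == 1), 5))
```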
Theorem 3.2. Let D be a denumerable set. Then $D \times D$ is denumerable.

Proof. It suffices to prove that $\mathbf{Z}^+ \times \mathbf{Z}^+$ is
denumerable. Consider the map

$$(m, n) \mapsto 2^m 3^n.$$

It is an injective map of $\mathbf{Z}^+ \times \mathbf{Z}^+$ into
$\mathbf{Z}^+$, and hence $\mathbf{Z}^+ \times \mathbf{Z}^+$ has the same
cardinality as an infinite subset of $\mathbf{Z}^+$, whence
$\mathbf{Z}^+ \times \mathbf{Z}^+$ is denumerable, as was to be shown.
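
The injectivity of this map rests on unique factorization: the pair (m, n)
can be recovered from $2^m 3^n$ by counting factors. A short sketch, with the
function names `encode` and `decode` chosen here for illustration:

```python
def encode(m, n):
    """The map (m, n) -> 2^m 3^n on pairs of positive integers."""
    return 2 ** m * 3 ** n


def decode(code):
    """Recover (m, n) from 2^m 3^n by stripping the factors 2 and 3."""
    m = n = 0
    while code % 2 == 0:
        code //= 2
        m += 1
    while code % 3 == 0:
        code //= 3
        n += 1
    if code != 1 or m == 0 or n == 0:
        raise ValueError("not of the form 2^m 3^n with m, n >= 1")
    return m, n


pairs = [(m, n) for m in range(1, 31) for n in range(1, 31)]
codes = [encode(m, n) for m, n in pairs]
```

Since `decode` is a left inverse of `encode`, distinct pairs receive distinct
codes. The map is far from surjective (5 is not a value, for instance), which
is why the proof passes through Theorem 3.1.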

In this proof, we have used the factorization of integers. One can 
also give a proof without using that fact. The idea for such a proof is 
illustrated in the following diagram: 



We must define a bijection $\mathbf{Z}^+ \to \mathbf{Z}^+ \times \mathbf{Z}^+$.
We map 1 on (1, 1). Inductively, suppose that we have defined an injective map

$$f: \{1, \ldots, n\} \to \mathbf{Z}^+ \times \mathbf{Z}^+.$$

We wish to define f(n + 1).

If f(n) = (1, k) then we let f(n + 1) = (k + 1, 1).

If f(n) = (r, k) with $r \neq 1$ then we let f(n + 1) = (r - 1, k + 1).

It is then routinely checked that we obtain an injection of
$\{1, \ldots, n + 1\}$ into $\mathbf{Z}^+ \times \mathbf{Z}^+$. By induction,
we obtain a map of $\mathbf{Z}^+$ into $\mathbf{Z}^+ \times \mathbf{Z}^+$
which is also routinely verified to be a bijection. In the diagram, our
map f can be described as follows. We start in the corner (1, 1), and
then move towards the inside of the quadrant, starting on the horizontal
axis, moving diagonally leftwards until we hit the vertical axis, and then
starting on the horizontal axis one step further, repeating the process.
Geometrically, it is then clear that our path goes through every point
(i, j) of $\mathbf{Z}^+ \times \mathbf{Z}^+$.
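
The two rules defining f(n + 1) from f(n) translate directly into a program,
and the first values make the diagonal path visible. This is a minimal
sketch; the names `step` and `diagonal_path` are introduced here only for
illustration.

```python
def step(pair):
    """Pass from f(n) to f(n + 1) according to the two rules above."""
    r, k = pair
    if r == 1:
        return (k + 1, 1)      # end of a diagonal: restart on the horizontal axis
    return (r - 1, k + 1)      # otherwise move diagonally leftwards


def diagonal_path(length):
    """Return the list [f(1), ..., f(length)], starting from f(1) = (1, 1)."""
    pair = (1, 1)
    path = [pair]
    for _ in range(length - 1):
        pair = step(pair)
        path.append(pair)
    return path


path = diagonal_path(55)
```

The first 55 values are exactly the pairs (r, k) with r + k at most 11, each
occurring once, which is the finite form of the statement that the path goes
through every point.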
Corollary 3.3. For every positive integer n, the product
$D \times \cdots \times D$ taken n times is denumerable.

Proof. Induction. 

Corollary 3.4. Let $\{D_1, D_2, \ldots\}$ be a sequence of denumerable sets,
also written $\{D_i\}_{i \in \mathbf{Z}^+}$. Then the union

$$U = \bigcup_{i=1}^{\infty} D_i$$

is denumerable.

Proof. For each i we have an enumeration of the elements of $D_i$, say

$$D_i = \{a_{i1}, a_{i2}, \ldots\}.$$

Then the map

$$(i, j) \mapsto a_{ij}$$

is a map from $\mathbf{Z}^+ \times \mathbf{Z}^+$ into U, and is in fact
surjective. Let

$$f: \mathbf{Z}^+ \times \mathbf{Z}^+ \to U$$

be this map. For each $a \in U$ there exists an element
$x \in \mathbf{Z}^+ \times \mathbf{Z}^+$ such that f(x) = a, and we can write
this element x in the form $x_a$. The association $a \mapsto x_a$ is an
injection of U into $\mathbf{Z}^+ \times \mathbf{Z}^+$, and we can now
apply Theorems 3.1 and 3.2 to conclude the proof.


In the preceding proof, we used a special case of a cardinality state- 
ment which it is useful to state in general: 


Let $f: A \to B$ be a surjective map of a set A onto a set B. Then

$$\mathrm{card}(B) \leq \mathrm{card}(A).$$

This is easily seen, because for each $y \in B$ there exists an element
$x \in A$, denoted by $x_y$, such that $f(x_y) = y$. Then the association
$y \mapsto x_y$ is an injective mapping of B into A, whence by definition,

$$\mathrm{card}(B) \leq \mathrm{card}(A).$$

In dealing with arbitrary cardinalities, one needs a theorem which is 
somewhat less trivial than in the denumerable case. 


Theorem 3.5 (Schroeder-Bernstein). Let A, B be sets, and suppose that
$\mathrm{card}(A) \leq \mathrm{card}(B)$, and
$\mathrm{card}(B) \leq \mathrm{card}(A)$. Then

$$\mathrm{card}(A) = \mathrm{card}(B).$$

Proof. Let $f: A \to B$ and $g: B \to A$ be injective maps. We separate A
into two disjoint sets $A_1$ and $A_2$. We let $A_1$ consist of all $x \in A$
such that, when we lift back x by a succession of inverse maps,

$$x, \quad g^{-1}(x), \quad f^{-1} \circ g^{-1}(x), \quad \ldots$$

then at some stage we reach an element of A which cannot be lifted
back to B by $g^{-1}$. We let $A_2$ be the complement of $A_1$, in other words,
the set of $x \in A$ which can be lifted back indefinitely, or such that we get
stopped in B (i.e. reach an element of B which has no inverse image in A
by $f^{-1}$). Then $A = A_1 \cup A_2$. We shall define a bijection h of A onto B.

If $x \in A_1$, we define h(x) = f(x).

If $x \in A_2$, we define $h(x) = g^{-1}(x)$ = unique element $y \in B$ such that
g(y) = x.

Then trivially, h is injective. We must prove that h is surjective. Let
$b \in B$. If, when we try to lift back b by a succession of maps

$$\cdots \circ f^{-1} \circ g^{-1} \circ f^{-1}(b)$$

we can lift back indefinitely, or if we get stopped in B, then g(b) belongs
to $A_2$ and consequently b = h(g(b)), so b lies in the image of h. On the
other hand, if we cannot lift back b indefinitely, and get stopped in A,
then $f^{-1}(b)$ is defined (i.e. b is in the image of f), and $f^{-1}(b)$ lies
in $A_1$. In this case, $b = h(f^{-1}(b))$ is also in the image of h, as was to
be shown.
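
To see the construction of h at work, one may try it on a concrete pair of
injections. The following sketch makes choices that are not in the text: A
and B are both taken to be the positive integers, and both f and g are the
doubling map. Lifting back is then repeated halving, which always stops, so
the case of indefinite lifting does not arise in this example.

```python
def f(x):
    """Injection A -> B (assumed here: doubling)."""
    return 2 * x


def g(y):
    """Injection B -> A (assumed here: doubling)."""
    return 2 * y


def stops_in_A(x):
    """Lift x in A back through g^{-1}, f^{-1}, ...; True if we get stopped in A."""
    in_A = True
    while x % 2 == 0:      # an even number has an inverse image under doubling
        x //= 2
        in_A = not in_A    # each lift moves to the other set
    return in_A


def h(x):
    """h = f on A_1 (stopped in A) and h = g^{-1} on A_2 (stopped in B)."""
    return f(x) if stops_in_A(x) else x // 2


values = [h(x) for x in range(1, 201)]
```

For instance h(1) = 2, h(2) = 1, h(4) = 8 and h(8) = 4. Although neither f
nor g is surjective, the map h obtained from them is a bijection of the
positive integers onto themselves.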

Next we consider theorems concerning sums and products of cardina- 
lities. 

We shall reduce the study of cardinalities of products of arbitrary sets
to the denumerable case, using Zorn’s lemma. Note first that an infinite
set A always contains a denumerable set. Indeed, since A is infinite, we
can first select an element $a_1 \in A$, and the complement of $\{a_1\}$ is
infinite. Inductively, if we have selected distinct elements
$a_1, \ldots, a_n$ in A, the complement of $\{a_1, \ldots, a_n\}$ is infinite,
and we can select $a_{n+1}$ in this complement. In this way, we obtain a
sequence of distinct elements of A, giving rise to a denumerable subset of A.

Let A be a set. By a covering of A one means a set $\Gamma$ of subsets of A
such that the union

$$\bigcup_{C \in \Gamma} C$$

of all the elements of $\Gamma$ is equal to A. We shall say that $\Gamma$ is a
disjoint covering if whenever $C, C' \in \Gamma$, and $C \neq C'$, then the
intersection of C and C' is empty.

Lemma 3.6. Let A be an infinite set. Then there exists a disjoint covering
of A by denumerable sets.

Proof. Let S be the set whose elements are pairs $(B, \Gamma)$ consisting of a
subset B of A, and a disjoint covering $\Gamma$ of B by denumerable sets. Then
S is not empty. Indeed, since A is infinite, A contains a denumerable set D,
and the pair $(D, \{D\})$ is in S. If $(B, \Gamma)$ and $(B', \Gamma')$ are
elements of S, we define

$$(B, \Gamma) \leq (B', \Gamma')$$

to mean that $B \subset B'$, and $\Gamma \subset \Gamma'$. Let T be a totally
ordered non-empty subset of S. We may write
$T = \{(B_i, \Gamma_i)\}_{i \in I}$ for some indexing set I. Let

$$B = \bigcup_{i \in I} B_i \quad \text{and} \quad \Gamma = \bigcup_{i \in I} \Gamma_i.$$

If $C, C' \in \Gamma$, $C \neq C'$, then there exist some indices i, j such that
$C \in \Gamma_i$ and $C' \in \Gamma_j$. Since T is totally ordered, we have, say,

$$(B_i, \Gamma_i) \leq (B_j, \Gamma_j).$$

Hence in fact, C, C' are both elements of $\Gamma_j$, and hence C, C' have an
empty intersection. On the other hand, if $x \in B$, then $x \in B_i$ for some
i, and hence there is some $C \in \Gamma_i$ such that $x \in C$. Hence $\Gamma$
is a disjoint covering of B. Since the elements of each $\Gamma_i$ are
denumerable subsets of A, it follows that $\Gamma$ is a disjoint covering of B
by denumerable sets, so $(B, \Gamma)$ is in S, and is obviously an upper bound
for T. Therefore S is inductively ordered.

Let $(M, \Delta)$ be a maximal element of S, by Zorn’s lemma. Suppose
that $M \neq A$. If the complement of M in A is infinite, then there exists a
denumerable set D contained in this complement. Then

$$(M \cup D, \Delta \cup \{D\})$$

is a bigger pair than $(M, \Delta)$, contradicting the maximality of
$(M, \Delta)$. Hence the complement of M in A is a finite set F. Let $D_0$ be
an element of $\Delta$. Let $D_1 = D_0 \cup F$. Then $D_1$ is denumerable. Let
$\Delta_1$ be the set consisting of all elements of $\Delta$, except $D_0$,
together with $D_1$. Then $\Delta_1$ is a disjoint covering of A by denumerable
sets, as was to be shown.

Theorem 3.7. Let A be an infinite set, and let D be a denumerable set.
Then

$$\mathrm{card}(A \times D) = \mathrm{card}(A).$$

Proof. By the lemma, we can write

$$A = \bigcup_{i \in I} D_i$$

as a disjoint union of denumerable sets. Then

$$A \times D = \bigcup_{i \in I} (D_i \times D).$$

For each $i \in I$, there is a bijection of $D_i \times D$ on $D_i$ by
Theorem 3.2. Since the sets $D_i \times D$ are disjoint, we get in this way a
bijection of $A \times D$ on A, as desired.

Corollary 3.8. If F is a finite non-empty set, then

$$\mathrm{card}(A \times F) = \mathrm{card}(A).$$

Proof. We have

$$\mathrm{card}(A) \leq \mathrm{card}(A \times F) \leq \mathrm{card}(A \times D) = \mathrm{card}(A).$$

We can then use Theorem 3.5 to get what we want. 

Corollary 3.9. Let A, B be non-empty sets, A infinite, and suppose
$\mathrm{card}(B) \leq \mathrm{card}(A)$. Then

$$\mathrm{card}(A \cup B) = \mathrm{card}(A).$$

Proof. We can write $A \cup B = A \cup C$ for some subset C of B, such that
C and A are disjoint. (We let C be the set of all elements of B which
are not elements of A.) Then $\mathrm{card}(C) \leq \mathrm{card}(A)$. We can
then construct an injection of $A \cup C$ into the product

$$A \times \{1, 2\}$$

of A with a set consisting of 2 elements. Namely, we have a bijection
of A with $A \times \{1\}$ in the obvious way, and also an injection of C into
$A \times \{2\}$. Thus

$$\mathrm{card}(A \cup C) \leq \mathrm{card}(A \times \{1, 2\}).$$

We conclude the proof by Corollary 3.8 and Theorem 3.5. 

Theorem 3.10. Let A be an infinite set. Then

$$\mathrm{card}(A \times A) = \mathrm{card}(A).$$

Proof. Let S be the set consisting of pairs (B, f) where B is an infinite
subset of A, and $f: B \to B \times B$ is a bijection of B onto $B \times B$.
Then S is not empty because if D is a denumerable subset of A, we can always
find a bijection of D onto $D \times D$. If (B, f) and (B', f') are in S, we
define $(B, f) \leq (B', f')$ to mean $B \subset B'$, and the restriction of f'
to B is equal to f. Then S is partially ordered, and we contend that S is
inductively ordered. Let T be a non-empty totally ordered subset of S, and
say T consists of the pairs $(B_i, f_i)$ for i in some indexing set I. Let

$$M = \bigcup_{i \in I} B_i.$$

We shall define a bijection $g: M \to M \times M$. If $x \in M$, then x lies in
some $B_i$. We define $g(x) = f_i(x)$. This value $f_i(x)$ is independent of
the choice of $B_i$ in which x lies. Indeed, if $x \in B_j$ for some
$j \in I$, then say

$$(B_i, f_i) \leq (B_j, f_j).$$

By assumption, $B_i \subset B_j$, and $f_j(x) = f_i(x)$, so g is well defined.
To show g is surjective, let $x, y \in M$ and $(x, y) \in M \times M$. Then
$x \in B_i$ for some $i \in I$ and $y \in B_j$ for some $j \in I$. Again since T
is totally ordered, say $(B_i, f_i) \leq (B_j, f_j)$. Thus $B_i \subset B_j$,
and $x, y \in B_j$. There exists an element $b \in B_j$ such that
$f_j(b) = (x, y) \in B_j \times B_j$. By definition, g(b) = (x, y), so g
is surjective. We leave the proof that g is injective to the reader to
conclude the proof that g is a bijection. We then see that (M, g) is an upper
bound for T in S, and therefore that S is inductively ordered.

Let (M, g) be a maximal element of S, and let C be the complement
of M in A. If $\mathrm{card}(C) \leq \mathrm{card}(M)$, then

$$\mathrm{card}(M) \leq \mathrm{card}(A) = \mathrm{card}(M \cup C) = \mathrm{card}(M)$$

by Corollary 3.9, and hence $\mathrm{card}(M) = \mathrm{card}(A)$ by
Bernstein’s theorem. Since $\mathrm{card}(M) = \mathrm{card}(M \times M)$, we
are done with the proof in this case. If
$\mathrm{card}(M) \leq \mathrm{card}(C)$, then there exists a subset $M_1$ of C
having the same cardinality as M. We shall prove this is not possible. We
consider

$$(M \cup M_1) \times (M \cup M_1) = (M \times M) \cup (M_1 \times M) \cup (M \times M_1) \cup (M_1 \times M_1).$$

By the assumption on M and Corollary 3.9, the union of the last three
sets in parentheses on the right of this equation has the same cardinality
as M. Thus

$$(M \cup M_1) \times (M \cup M_1) = (M \times M) \cup M_2,$$

where $M_2$ is disjoint from $M \times M$, and has the same cardinality as M.
We now define a bijection

$$g_1: M \cup M_1 \to (M \cup M_1) \times (M \cup M_1).$$

We let $g_1(x) = g(x)$ if $x \in M$, and we let $g_1$ on $M_1$ be any bijection
of $M_1$ on $M_2$. In this way we have extended g to $M \cup M_1$, and the pair
$(M \cup M_1, g_1)$ is in S, contradicting the maximality of (M, g). The case
$\mathrm{card}(M) \leq \mathrm{card}(C)$ therefore cannot occur, and our
theorem is proved.

Corollary 3.11. If A is an infinite set, and
$A^{(n)} = A \times \cdots \times A$ is the product taken n times, then

$$\mathrm{card}(A^{(n)}) = \mathrm{card}(A).$$

Proof. Induction.

Corollary 3.12. If $A_1, \ldots, A_n$ are non-empty infinite sets, and

$$\mathrm{card}(A_i) \leq \mathrm{card}(A_n)$$

for i = 1, ..., n, then

$$\mathrm{card}(A_1 \times \cdots \times A_n) = \mathrm{card}(A_n).$$

Proof. We have

$$\mathrm{card}(A_n) \leq \mathrm{card}(A_1 \times \cdots \times A_n) \leq \mathrm{card}(A_n \times \cdots \times A_n)$$

and we use Corollary 3.11 and the Schroeder-Bernstein theorem to conclude
the proof.

Corollary 3.13. Let A be an infinite set, and let $\Phi$ be the set of finite
subsets of A. Then

$$\mathrm{card}(\Phi) = \mathrm{card}(A).$$

Proof. Let $\Phi_n$ be the set of subsets of A having exactly n elements, for
each integer n = 1, 2, .... We first show that
$\mathrm{card}(\Phi_n) \leq \mathrm{card}(A)$. If F is an element of
$\Phi_n$, we order the elements of F in any way, say

$$F = \{x_1, \ldots, x_n\},$$

and we associate with F the element $(x_1, \ldots, x_n) \in A^{(n)}$,

$$F \mapsto (x_1, \ldots, x_n).$$

If G is another subset of A having n elements, say
$G = \{y_1, \ldots, y_n\}$, and $G \neq F$, then

$$(x_1, \ldots, x_n) \neq (y_1, \ldots, y_n).$$

Hence our map

$$F \mapsto (x_1, \ldots, x_n)$$

of $\Phi_n$ into $A^{(n)}$ is injective. By Corollary 3.11, we conclude that

$$\mathrm{card}(\Phi_n) \leq \mathrm{card}(A).$$

Now $\Phi$ is the disjoint union of the $\Phi_n$ for n = 1, 2, ... and it is an
exercise to show that $\mathrm{card}(\Phi) \leq \mathrm{card}(A)$ (cf.
Exercise 1). Since

$$\mathrm{card}(A) \leq \mathrm{card}(\Phi),$$

because in particular, $\mathrm{card}(\Phi_1) = \mathrm{card}(A)$, we see that
our corollary is proved.

In the next theorem, we shall see that given a set, there always exists 
another set whose cardinality is bigger. 

Theorem 3.14. Let A be an infinite set, and T the set consisting of two
elements {0, 1}. Let M be the set of all maps of A into T. Then

$$\mathrm{card}(A) \leq \mathrm{card}(M) \quad \text{and} \quad \mathrm{card}(A) \neq \mathrm{card}(M).$$

Proof. For each $x \in A$ we let

$$f_x: A \to \{0, 1\}$$

be the map such that $f_x(x) = 1$ and $f_x(y) = 0$ if $y \neq x$. Then
$x \mapsto f_x$ is obviously an injection of A into M, so that
$\mathrm{card}(A) \leq \mathrm{card}(M)$. Suppose that
$\mathrm{card}(A) = \mathrm{card}(M)$. Let

$$x \mapsto g_x$$

be a bijection between A and M. We define a map $h: A \to \{0, 1\}$ by the
rule

$$h(x) = 0 \ \text{ if } g_x(x) = 1,$$

$$h(x) = 1 \ \text{ if } g_x(x) = 0.$$

Then certainly $h \neq g_x$ for any x, and this contradicts the assumption that
$x \mapsto g_x$ is a bijection, thereby proving Theorem 3.14.
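
The diagonal map h makes sense for any assignment $x \mapsto g_x$, whether or
not A is infinite, and the key property $h \neq g_x$ can be checked on a small
example. In the sketch below, the finite set A, the random choice of the maps
$g_x$, and the function name `diagonal` are all assumptions made for
illustration; a map of A into {0, 1} is stored as a dictionary.

```python
import random


def diagonal(A, g):
    """Return h with h(x) = 0 if g_x(x) = 1 and h(x) = 1 if g_x(x) = 0."""
    return {x: 1 - g[x][x] for x in A}


A = range(6)
rng = random.Random(0)   # fixed seed, so the example is reproducible
g = {x: {y: rng.randint(0, 1) for y in A} for x in A}   # an arbitrary x -> g_x
h = diagonal(A, g)
```

Whatever values the maps $g_x$ take, h differs from $g_x$ at the point x, so h
is not among the $g_x$ and the assignment $x \mapsto g_x$ is not surjective.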
Corollary 3.15. Let A be an infinite set, and let S be the set of all
subsets of A. Then $\mathrm{card}(A) \leq \mathrm{card}(S)$ and
$\mathrm{card}(A) \neq \mathrm{card}(S)$.

Proof. We leave it as an exercise. [Hint: If B is a non-empty subset
of A, use the characteristic function $\varphi_B$ such that

$$\varphi_B(x) = 1 \ \text{ if } x \in B,$$

$$\varphi_B(x) = 0 \ \text{ if } x \notin B.$$

What can you say about the association $B \mapsto \varphi_B$?]


X, §3. EXERCISES 

1. Prove the statement made in the proof of Corollary 3.13. 

2. If A is an infinite set, and $\Phi_n$ is the set of subsets of A having
exactly n elements, show that

$$\mathrm{card}(A) \leq \mathrm{card}(\Phi_n)$$

for $n \geq 1$.

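3. Let $A_i$ be infinite sets for i = 1, 2, ... and assume that

$$\mathrm{card}(A_i) \leq \mathrm{card}(A)$$

for some set A, and all i. Show that

$$\mathrm{card}\Bigl(\bigcup_{i=1}^{\infty} A_i\Bigr) \leq \mathrm{card}(A).$$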

4. Let K be a subfield of the complex numbers. Show that for each integer
$n \geq 1$, the cardinality of the set of extensions of K of degree n in C is
$\leq \mathrm{card}(K)$.

5. Let K be an infinite field, and E an algebraic extension of K. Show that
$\mathrm{card}(E) = \mathrm{card}(K)$.

6. Finish the proof of Corollary 3.15. 

7. If A, B are sets, denote by M(A, B) the set of all maps of A into B. If B, B'
are sets with the same cardinality, show that M(A, B) and M(A, B') have the
same cardinality. If A, A' have the same cardinality, show that M(A, B) and
M(A', B) have the same cardinality.

8. Let A be an infinite set and abbreviate card(A) by $\alpha$. If B is an
infinite set, abbreviate card(B) by $\beta$. Define $\alpha\beta$ to be
$\mathrm{card}(A \times B)$. Let B' be a set disjoint from A such that
card(B) = card(B'). Define $\alpha + \beta$ to be $\mathrm{card}(A \cup B')$.
Denote by $B^A$ the set of all maps of A into B, and denote
$\mathrm{card}(B^A)$ by $\beta^\alpha$. Let C be an infinite set and
abbreviate card(C) by $\gamma$. Prove the following statements:

(a) $\alpha(\beta + \gamma) = \alpha\beta + \alpha\gamma$
(b) $\alpha\beta = \beta\alpha$
(c) $\alpha^{\beta + \gamma} = \alpha^\beta \alpha^\gamma$
(d) $(\alpha^\beta)^\gamma = \alpha^{\beta\gamma}$.

9. Let K be an infinite field. Prove that there exists an algebraically closed
field A containing K as a subfield, and algebraic over K. [Hint: Let $\Omega$
be a set of cardinality strictly greater than the cardinality of K, and
containing K. Consider the set S of all pairs $(E, \varphi)$ where E is a
subset of $\Omega$ such that $K \subset E$, and $\varphi$ denotes a law of
addition and multiplication on E which makes E into a field such that K is a
subfield, and E is algebraic over K. Define a partial ordering on S in an
obvious way; show that S is inductively ordered, and that a maximal element
is algebraic over K and algebraically closed. You will need Exercise 5 in
the last step.]

10. Let K be an infinite field. Show that the field of rational functions K(t)
has the same cardinality as K.

11. Let $J_n$ be the set of integers $\{1, \ldots, n\}$. Let $\mathbf{Z}^+$ be
the set of positive integers. Show that the following sets have the same
cardinality:

(a) The set of all maps $M(\mathbf{Z}^+, J_n)$ with $n \geq 2$.

(b) The set of all maps $M(\mathbf{Z}^+, J_2)$.

(c) The set of all real numbers x such that $0 \leq x < 1$.

(d) The set of all real numbers.

[Hint: Use decimal expansions.]

12. Show that $M(\mathbf{Z}^+, \mathbf{Z}^+)$ has the same cardinality as the
real numbers.

13. Prove that the sets $\mathbf{R}$, $M(\mathbf{Z}^+, \mathbf{R})$,
$M(\mathbf{Z}^+, \mathbf{Z}^+)$ have the same cardinalities.


X, §4. WELL-ORDERING 

A set A is said to be well-ordered if it is totally ordered, and if every
non-empty subset B has a least element, i.e. an element $a \in B$ such that
$a \leq x$ for all $x \in B$.

Example 1. The set of positive integers Z + is well-ordered. Any finite 
set can be well-ordered, and a denumerable set D can be well-ordered: 
Any bijection of D with Z + will give rise to a well-ordering of D. 

Example 2. Let D be a denumerable set which is well-ordered. Let b
be an element of some set, and $b \notin D$. Let $A = D \cup \{b\}$. We define
$x \leq b$ for all $x \in D$. Then A is totally ordered, and is in fact
well-ordered. Proof: Let B be a non-empty subset of A. If B consists of b
alone, then b is a least element of B. Otherwise, B contains some element
$a \in D$. Then $B \cap D$ is not empty, and hence has a least element, which
is obviously also a least element for B.

Example 3. Let $D_1$, $D_2$ be two denumerable sets, each one well-ordered,
and assume that $D_1 \cap D_2$ is empty. Let $A = D_1 \cup D_2$. We
define a total ordering in A by letting x < y for all $x \in D_1$ and all
$y \in D_2$. Using the same type of argument as in Example 2, we see that A is
well-ordered.

Example 4. Proceeding inductively, given a sequence of disjoint denumerable
sets $D_1, D_2, \ldots$ we let $A = \bigcup D_i$, and we can define a
well-ordering on A by ordering each $D_i$ like $\mathbf{Z}^+$, and then
defining x < y for $x \in D_i$ and $y \in D_{i+1}$. One may visualize this
situation as follows:

$$D_1 \qquad D_2 \qquad D_3 \qquad \ldots$$

Theorem 4.1. Every non-empty infinite set A can be well-ordered. 

Proof. Let S be the set of all pairs (X, R) where X is a subset of A,
and R is a total ordering of X such that X is well-ordered. Then S is
not empty, since given a denumerable subset D of A, we can always
well-order it like the positive integers. If (X, R) and (Y, Q) are elements
of S, we define $(X, R) \leq (Y, Q)$ if $X \subset Y$, if the restriction of Q
to X is equal to R, if X is the beginning segment of Y, and if every element
$y \in Y$, $y \notin X$ is such that x < y for all $x \in X$. Then S is
partially ordered. To show that S is inductively ordered, let T be a totally
ordered non-empty subset of S, say $T = \{(X_i, R_i)\}_{i \in I}$. Let

$$M = \bigcup_{i \in I} X_i.$$

Let $x, y \in M$. There exist $i, j \in I$ such that $x \in X_i$ and
$y \in X_j$. Since T is totally ordered, say $(X_i, R_i) \leq (X_j, R_j)$.
Then both $x, y \in X_j$. We define $x \leq y$ in M if $x \leq y$ in $X_j$.
This is easily seen to be independent of the choice of $(X_j, R_j)$ such that
$x, y \in X_j$, and it is then trivially verified that we have defined a total
ordering on M, which we denote by (M, P). We contend that this total ordering
on M is a well-ordering. To see this, let N be a non-empty subset of M. Let
$x_0 \in N$. Then there exists some $i_0 \in I$ such that $x_0 \in X_{i_0}$.
The subset $N \cap X_{i_0}$ is not empty. Let a be a least element. We contend
that a is in fact a least element of N. Let $x \in N$. Then x lies in some
$X_i$. Since T is totally ordered, we have

$$(X_i, R_i) \leq (X_{i_0}, R_{i_0}) \quad \text{or} \quad (X_{i_0}, R_{i_0}) \leq (X_i, R_i).$$

In the first case, $x \in X_i \subset X_{i_0}$ and hence $a \leq x$. In the
second case, if $x \notin X_{i_0}$ then by definition, a < x. This proves that
(M, P) is a well-ordering.

We have therefore proved that S is inductively ordered. By Zorn’s
lemma, there exists a maximal element (M, P) of S. Then M is well-ordered,
and all that remains to be shown is that M = A. Suppose $M \neq A$, and let z
be an element of A and $z \notin M$. Let $M' = M \cup \{z\}$. We define a total
ordering on M' by defining x < z for all $x \in M$. Then M' is well-ordered,
for let N be a non-empty subset of M'. If $N \cap M$ is not empty, then
$N \cap M$ has a least element a, which is obviously a least element for N.
Otherwise N = {z}, and z is its least element. Thus M' with this ordering is
an element of S strictly bigger than (M, P). This contradicts the fact that M
is maximal in S. Hence M = A, and our theorem is proved.

Remark. It is an elaborate matter to axiomatize the theory of sets be- 
yond the point where we have carried it in the arguments of this chapter. 
Since all the arguments of the chapter are easily acceptable to working 
mathematicians, it is a reasonable policy to stop at this point without 
ever looking at the deeper foundations. 

One may, however, be interested in these foundations for their own 
sake, as a matter of taste. We refer readers to technical books on the 
subject if they are so inclined. 


Appendix 


APP., §1. THE NATURAL NUMBERS 


The purpose of this appendix is to show how the integers can be ob- 
tained axiomatically using only the terminology and elementary proper- 
ties of sets. The rules of the game from now on allow us to use only 
sets and mappings. 

We assume given once for all a set N called the set of natural
numbers, and a map $\sigma: \mathbf{N} \to \mathbf{N}$, satisfying the
following (Peano) axioms:

NN 1. There is an element $0 \in \mathbf{N}$.

NN 2. We have $\sigma(0) \neq 0$ and if we let $\mathbf{N}^+$ denote the subset
of N consisting of all $n \in \mathbf{N}$, $n \neq 0$, then the map
$x \mapsto \sigma(x)$ is a bijection between N and $\mathbf{N}^+$.

NN 3. If S is a subset of N, if $0 \in S$, and if $\sigma(n)$ lies in S
whenever n lies in S, then S = N.

We often denote $\sigma(n)$ by n' and think of n' as the successor of n. The
reader will recognize NN 3 as induction.

We denote $\sigma(0)$ by 1.

Our next task is to define addition between natural numbers. 

Lemma 1.1. Let $f: \mathbf{N} \to \mathbf{N}$ and
$g: \mathbf{N} \to \mathbf{N}$ be maps such that

$$f(0) = g(0) \quad \text{and} \quad f(n') = f(n)', \quad g(n') = g(n)' \quad \text{for all } n \in \mathbf{N}.$$

Then f = g.

Proof. Let S be the subset of N consisting of all n such that

$$f(n) = g(n).$$

Then S obviously satisfies the hypotheses of induction, so S = N, thereby
proving the lemma.

For each $m \in \mathbf{N}$, we wish to define m + n with $n \in \mathbf{N}$
such that

$(1_m)$    m + 0 = m and m + n' = (m + n)' for all $n \in \mathbf{N}$.

By Lemma 1.1, this is possible in only one way.

If m = 0, we define 0 + n = n for all $n \in \mathbf{N}$. Then $(1_m)$ is
obviously satisfied. Let T be the set of $m \in \mathbf{N}$ for which one can
define m + n for all $n \in \mathbf{N}$ in such a way that $(1_m)$ is
satisfied. Then $0 \in T$. Suppose $m \in T$. We define for all
$n \in \mathbf{N}$,

$$m' + 0 = m' \quad \text{and} \quad m' + n = (m + n)'.$$

Then

$$m' + n' = (m + n')' = ((m + n)')' = (m' + n)'.$$

Hence $(1_{m'})$ is satisfied, so $m' \in T$. This proves that T = N, and thus
we have defined addition for all pairs (m, n) of natural numbers.
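
The defining conditions $(1_m)$ are a recursion on the second argument, and
can be run as such. The sketch below is only an illustration: the model of
the natural numbers by nested tuples (0 is the empty tuple and the successor
of n is the one-element tuple containing n) and the helper names are
assumptions made here, not part of the axioms.

```python
ZERO = ()


def succ(n):
    """The successor n' of n, modelled as the one-element tuple (n,)."""
    return (n,)


def add(m, n):
    """Addition defined by (1_m): m + 0 = m and m + n' = (m + n)'."""
    if n == ZERO:
        return m
    (pred,) = n                 # n = pred', so m + n = (m + pred)'
    return succ(add(m, pred))


def from_int(k):
    """Build the natural number k by applying succ k times to ZERO."""
    n = ZERO
    for _ in range(k):
        n = succ(n)
    return n


def to_int(n):
    """Count how many times succ was applied to reach n."""
    k = 0
    while n != ZERO:
        (n,) = n
        k += 1
    return k


three, four = from_int(3), from_int(4)
seven = add(three, four)
```

On small values one can then test the commutativity and associativity proved
below, for instance that `add(three, four)` and `add(four, three)` are equal.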

The properties of addition are easily proved. 

Commutativity. Let S be the set of all natural numbers m such that

$(2_m)$    m + n = n + m for all $n \in \mathbf{N}$.

Then 0 is obviously in S, and if $m \in S$, then

$$m' + n = (m + n)' = (n + m)' = n + m',$$

thereby proving that S = N, as desired.

Associativity. Let S be the set of natural numbers m such that

$(3_m)$    (m + n) + k = m + (n + k) for all $n, k \in \mathbf{N}$.

Then 0 is obviously in S. Suppose $m \in S$. Then

$$(m' + n) + k = (m + n)' + k = ((m + n) + k)' = (m + (n + k))' = m' + (n + k),$$

thereby proving that S = N, as desired.

Cancellation law. Let m be a natural number. We shall say that the
cancellation law holds for m if for all $k, n \in \mathbf{N}$ satisfying
m + k = m + n we must have k = n. Let S be the set of m for which the
cancellation law holds. Then obviously $0 \in S$, and if $m \in S$, then

$$m' + k = m' + n \quad \text{implies} \quad (m + k)' = (m + n)'.$$

Since the mapping $x \mapsto x'$ is injective, it follows that m + k = m + n,
whence k = n. By induction, S = N.

For multiplication, and other applications, we need to generalize
Lemma 1.1.

Lemma 1.2. Let S be a set, and $\varphi: S \to S$ a map of S into itself. Let
f, g be maps of N into S. If

$$f(0) = g(0) \quad \text{and} \quad f(n') = \varphi(f(n)), \quad g(n') = \varphi(g(n))$$

for all $n \in \mathbf{N}$, then f = g.

Proof. Trivial by induction.

For each natural number m, it follows from Lemma 1.2 that there is
at most one way of defining a product mn satisfying

$$m0 = 0 \quad \text{and} \quad mn' = mn + m \quad \text{for all } n \in \mathbf{N}.$$

We in fact define the product this way in the same inductive manner
that we did for addition, and then prove in a similar way that this
product is commutative, associative, and distributive, that is

$$m(n + k) = mn + mk$$

for all $m, n, k \in \mathbf{N}$. We leave the details to the reader.

In this way, we obtain all the properties of a ring, except that N is
not an additive group: We lack additive inverses. Note that 1 is a unit
element for the multiplication, that is 1m = m for all $m \in \mathbf{N}$.

It is also easy to prove the multiplicative cancellation law, namely if
mk = mn and $m \neq 0$, then k = n. We also leave this to the reader. In
particular, if $mn \neq 0$, then $m \neq 0$ and $n \neq 0$.

We recall that an ordering in a set X is a relation $x \leq y$ between
certain pairs (x, y) of elements of X, satisfying the conditions (for all
$x, y, z \in X$):

PO 1. We have $x \leq x$.

PO 2. If $x \leq y$ and $y \leq z$, then $x \leq z$.

PO 3. If $x \leq y$ and $y \leq x$, then x = y.

The ordering is called a total ordering if given $x, y \in X$ we have
$x \leq y$ or $y \leq x$. We write x < y if $x \leq y$ and $x \neq y$.

We can define an ordering in N by defining $n \leq m$ if there exists
$k \in \mathbf{N}$ such that m = n + k. The proof that this is an ordering is
routine and left to the reader. This is in fact a total ordering, and we give
the proof for that. Given a natural number m, let $C_m$ be the set of
$n \in \mathbf{N}$ such that $n \leq m$ or $m \leq n$. Then certainly
$0 \in C_m$. Suppose that $n \in C_m$. If n = m, then n' = m + 1, so
$m \leq n'$. If n < m, then m = n + k' for some $k \in \mathbf{N}$, so that

$$m = n + k' = (n + k)' = n' + k,$$

and $n' \leq m$. If $m \leq n$, then for some k, we have n = m + k, so that
n + 1 = m + k + 1 and $m \leq n + 1$. By induction, $C_m = \mathbf{N}$,
thereby showing our ordering is total.

It is then easy to prove standard statements concerning inequalities,
e.g.

m < n if and only if m + k < n + k for some $k \in \mathbf{N}$,

m < n if and only if mk < nk for some $k \in \mathbf{N}$, $k \neq 0$.

One can also replace “for some” by “for all” in these two assertions.
The proofs are left to the reader. It is also easy to prove that if m, n
are natural numbers and $m \leq n \leq m + 1$, then m = n or n = m + 1. We
leave the proof to the reader.

We now prove the first property of integers mentioned in Chapter I,
§2, namely the well-ordering:

Every non-empty subset S of N has a least element.

To see this, let T be the subset of N consisting of all n such that
$n \leq x$ for all $x \in S$. Then $0 \in T$, and $T \neq \mathbf{N}$. Hence
there exists $m \in T$ such that $m + 1 \notin T$ (by induction!). Then
$m \in S$ (otherwise m < x for all $x \in S$ which is impossible). It is then
clear that m is the smallest element of S, as desired.

In Chapter IX, we assumed known the properties of finite cardinalities.
We shall prove these here. For each natural number $n \neq 0$ let $J_n$ be
the set of natural numbers x such that $1 \leq x \leq n$.

If n = 1, then $J_n = \{1\}$, and there is only a single map of $J_1$ into
itself. This map is obviously bijective. We recall that sets A, B are said to
have the same cardinality if there is a bijection of A onto B. Since a
composite of bijections is a bijection, it follows that if

$$\mathrm{card}(A) = \mathrm{card}(B) \quad \text{and} \quad \mathrm{card}(B) = \mathrm{card}(C),$$

then $\mathrm{card}(A) = \mathrm{card}(C)$.

Let m be a natural number $\geq 1$ and let $k \in J_{m'}$. Then there is a
bijection between

$$J_{m'} - \{k\} \quad \text{and} \quad J_m$$

defined in the obvious way: We let $f: J_{m'} - \{k\} \to J_m$ be such that

$$f: x \mapsto x \ \text{ if } x < k,$$

$$f: x \mapsto \sigma^{-1}(x) \ \text{ if } x > k.$$

We let $g: J_m \to J_{m'} - \{k\}$ be such that

$$g: x \mapsto x \ \text{ if } x < k,$$

$$g: x \mapsto \sigma(x) \ \text{ if } x \geq k.$$

Then $f \circ g$ and $g \circ f$ are the respective identities, so f, g are
bijections.
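
The two maps are easily written down and checked for small m. In this sketch
the natural numbers are modelled by Python integers, so that $\sigma(x)$ is
x + 1 and $\sigma^{-1}(x)$ is x - 1; the function name is chosen here for
illustration.

```python
def remove_point_bijection(m, k):
    """Return the maps (f, g) between J_{m+1} - {k} and J_m, for 1 <= k <= m + 1."""
    def f(x):
        # J_{m+1} - {k} -> J_m: fix the elements below k, shift the others down
        return x if x < k else x - 1

    def g(x):
        # J_m -> J_{m+1} - {k}: fix the elements below k, shift the others up
        return x if x < k else x + 1

    return f, g


m, k = 5, 3
f, g = remove_point_bijection(m, k)
domain = [x for x in range(1, m + 2) if x != k]   # J_6 - {3}
images = [f(x) for x in domain]
```

With m = 5 and k = 3, the elements 1, 2, 4, 5, 6 are sent to 1, 2, 3, 4, 5,
and g sends them back.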

We conclude that for all natural numbers m ^ 1, if 

h - J m -*■ J m 

is an injection , t/zezz h is a bijection. 

Indeed, this is true for m = 1, and by induction, suppose the statement 
true for some m ^ 1. Let 

( P ■ J m’ J m’ 

be an injection. Let re J m > and let s = <p(r). Then we can define a map 

(Po-Jm' ~ M Jm- - {«} 

by x i— ► c p(x). The cardinality of each set J m . — {r} and J m > — {s} is the 
same as the cardinality of J m . By induction, it follows that cp 0 is a bijec- 
tion, whence <p is a bijection, as desired. 

We conclude that if $1 \le m < n$, then a map

\[ f\colon J_n \to J_m \]

cannot be injective.

For otherwise by what we have seen,

\[ f(J_m) = J_m, \]

and hence

\[ f(n) = f(x) \]

for some $x$ such that $1 \le x \le m$, so $f$ is not injective.
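
Both conclusions can be confirmed by brute force for small values. The sketch below enumerates every map between the sets $J_n$ and $J_m$, encoding a map as the tuple of its values. All function names are ours, and the enumeration is feasible only for small $n$ and $m$.

```python
from itertools import product

def J(n):
    """The elements of J_n, in increasing order."""
    return list(range(1, n + 1))

def maps(n, m):
    """All maps J_n -> J_m; entry x - 1 of each tuple is the image of x."""
    return product(J(m), repeat=n)

def is_injective(h):
    return len(set(h)) == len(h)

def is_surjective(h, m):
    return set(h) == set(J(m))

def injections_are_bijections(m):
    """Every injection J_m -> J_m is also surjective."""
    return all(is_surjective(h, m) for h in maps(m, m) if is_injective(h))

def no_injection(n, m):
    """No map J_n -> J_m is injective (expected when m < n)."""
    return not any(is_injective(h) for h in maps(n, m))
```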

Given a set $A$, we shall say that $\operatorname{card}(A) = n$ (or the cardinality of $A$ is
$n$, or $A$ has $n$ elements) for a natural number $n \ge 1$, if there is a bijection
of $A$ with $J_n$. By the above results, it follows that such a natural
number $n$ is uniquely determined by $A$. We also say that $A$ has
cardinality 0 if $A$ is empty. We say that $A$ is finite if $A$ has cardinality $n$
for some natural number $n$. It is then an exercise to prove the following
statements:

If $A$, $B$ are finite sets, and $A \cap B$ is empty, then

\[ \operatorname{card}(A) + \operatorname{card}(B) = \operatorname{card}(A \cup B). \]

Furthermore,

\[ \operatorname{card}(A)\operatorname{card}(B) = \operatorname{card}(A \times B). \]

We leave the proofs to the reader.


APP., §2. THE INTEGERS 

Having the natural numbers, we wish to define the integers. We do this 
the way it is done in elementary school. 

For each natural number $n \ne 0$ we select a new symbol denoted by
$-n$, and we denote by $\mathbf{Z}$ the set consisting of the union of $\mathbf{N}$ and all the
symbols $-n$ for $n \in \mathbf{N}$, $n \ne 0$. We must define addition in $\mathbf{Z}$. If $x, y \in \mathbf{N}$
we use the same addition as before. For all $x \in \mathbf{Z}$, we define

\[ 0 + x = x + 0 = x. \]

This is compatible with the addition defined in §1 when $x \in \mathbf{N}$.

Let $m, n \in \mathbf{N}$ and neither $n$ nor $m = 0$. If $m = n + k$ with $k \in \mathbf{N}$ we
define:

(a) $m + (-n) = (-n) + m = k$.

(b) $(-m) + n = n + (-m) = -k$ if $k \ne 0$, and $= 0$ if $k = 0$.

(c) $(-m) + (-n) = -(m + n)$.

Given $x, y \in \mathbf{Z}$, if not both $x, y$ are natural numbers, then at least one of
the situations (a), (b), (c) applies to their addition.

It is then tedious but routine to verify that $\mathbf{Z}$ is an additive group.
Next we define multiplication in $\mathbf{Z}$. If $x, y \in \mathbf{N}$ we use the same
multiplication as before. For all $x \in \mathbf{Z}$ we define $0x = x0 = 0$.

Let $m, n \in \mathbf{N}$ and neither $n$ nor $m = 0$. We define:

\[ (-m)n = n(-m) = -(mn) \quad\text{and}\quad (-m)(-n) = mn. \]

Then it is routinely verified that $\mathbf{Z}$ is a commutative ring, and is in fact
integral, its unit element being the element 1 in $\mathbf{N}$. In this way we get
the integers.
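
The construction can be modeled directly. In the sketch below, which is an illustration and not part of the formal development, a natural number is a non-negative Python `int` and the new symbol $-n$ is encoded as the tagged tuple `("neg", n)`. This encoding and all function names are our assumptions. Addition follows cases (a), (b), (c), and multiplication follows the sign rules just stated.

```python
def neg(n):
    """The new symbol -n for a natural number n != 0."""
    assert isinstance(n, int) and n > 0
    return ("neg", n)

def is_nat(x):
    return isinstance(x, int)

def add(x, y):
    if is_nat(x) and is_nat(y):
        return x + y                     # addition in N
    if x == 0:
        return y                         # 0 + y = y
    if y == 0:
        return x                         # x + 0 = x
    if is_nat(x):                        # x = m and y = -n
        m, n = x, y[1]
        # case (a) when m = n + k, case (b) when n = m + k with k != 0
        return m - n if m >= n else neg(n - m)
    if is_nat(y):
        return add(y, x)                 # both cases are stated symmetrically
    return neg(x[1] + y[1])              # case (c)

def mul(x, y):
    if is_nat(x) and is_nat(y):
        return x * y                     # multiplication in N
    if x == 0 or y == 0:
        return 0                         # 0x = x0 = 0
    if is_nat(x):
        return neg(x * y[1])             # n(-m) = -(mn)
    if is_nat(y):
        return neg(x[1] * y)             # (-m)n = -(mn)
    return x[1] * y[1]                   # (-m)(-n) = mn

def to_int(x):
    """Translate an element of the model into a built-in Python integer."""
    return x if is_nat(x) else -x[1]

def from_int(v):
    """Translate a built-in Python integer into the model."""
    return v if v >= 0 else neg(-v)
```

Comparing `add` and `mul` with the built-in integer operations on a small range, via `to_int` and `from_int`, gives a quick sanity check of the case distinctions. The ring axioms themselves still require the routine verification mentioned above.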

Observe that $\mathbf{Z}$ is an ordered ring in the sense of Chapter IX, §1
because the set of natural numbers $n \ne 0$ satisfies all the conditions given
in that chapter, as one sees directly from our definitions of multiplication
and addition.


APP., §3. INFINITE SETS 

A set A is said to be infinite if it is not finite (and in particular, not 
empty). 

We shall prove that an infinite set $A$ contains a denumerable subset.
For each nonempty subset $T$ of $A$, let $x_T$ be a chosen element of $T$. We
prove by induction that for each positive integer $n$ we can find uniquely
determined elements $x_1, \ldots, x_n \in A$ such that $x_1 = x_A$ is the chosen element
corresponding to the set $A$ itself, and for each $k = 1, \ldots, n - 1$, the
element $x_{k+1}$ is the chosen element in the complement of $\{x_1, \ldots, x_k\}$. When
$n = 1$, this is obvious. Assume the statement proved for $n \ge 1$. Then we
let $x_{n+1}$ be the chosen element in the complement of $\{x_1, \ldots, x_n\}$, which
is nonempty because $A$ is infinite. If
$x_1, \ldots, x_n$ are already uniquely determined, so is $x_{n+1}$. This proves what
we wanted. In particular, since the elements $x_1, \ldots, x_n$ are distinct for
all $n$, it follows that the subset of $A$ consisting of all elements $x_n$ is a
denumerable subset, as desired.
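
The inductive step can be imitated when a concrete choice function is available. The sketch below takes $A$ to be the set of even natural numbers and chooses the least element of each complement. Both this example set and the names `chosen_element` and `first_terms` are our assumptions; for a general infinite set no such explicit rule need exist, and the choice of the elements $x_T$ is simply given in advance.

```python
from itertools import count

def chosen_element(excluded):
    """Chosen element of the complement of `excluded` in A.

    Here A is the set of even natural numbers and the choice is the least
    element of A that does not lie in `excluded`.
    """
    return next(a for a in (2 * i for i in count()) if a not in excluded)

def first_terms(n):
    """The elements x_1, ..., x_n built as in the induction."""
    xs = []
    for _ in range(n):
        # x_{k+1} is the chosen element in the complement of {x_1, ..., x_k}
        xs.append(chosen_element(set(xs)))
    return xs
```

Each new term lies outside the set of earlier terms, so the terms are distinct, as in the last step of the proof.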